Emotion recognition has become an important task of modern human-computer interac- tion. A multilayer boosted HMM ( MBHMM ) classifier for automatic audio-visual emotion recognition is presented in this paper. A modified Baum-Welch algorithm is proposed for component HMM learn- ing and adaptive boosting (AdaBoost) is used to train ensemble classifiers for different layers (cues). Except for the first layer, the initial weights of training samples in current layer are decided by recognition results of the ensemble classifier in the upper layer. Thus the training procedure using current cue can focus more on the difficult samples according to the previous cue. Our MBHMM clas- sifier is combined by these ensemble classifiers and takes advantage of the complementary informa- tion from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that our approach is effective and promising.
A multimodal fusion classifier is presented based on neural networks (NNs) learned with hints for automatic spontaneous affect recognition. In case that different channels can provide com- plementary information, features are utilized from four behavioral cues: frontal-view facial expres- sion, profile-view facial expression, shoulder movement, and vocalization (audio). NNs are used in both single cue processing and multimodal fusion. Coarse categories and quadrants in the activation- evaluation dimensional space are utilized respectively as the heuristic information (hints) of NNs during training, aiming at recognition of basic emotions. With the aid of hints, the weights in NNs could learn optimal feature groupings and the subtlety and complexity of spontaneous affective states could be better modeled. The proposed method requires low computation effort and reaches high recognition accuracy, even if the training data is insufficient. Experiment results on the Semaine nat- uralistic dataset demonstrate that our method is effective and promising.