Multimodal emotion recognition system for spontaneous vocal and facial signals: SMERFS
Date of Publication
2010
Document Type
Bachelor's Thesis
Degree Name
Bachelor of Science in Computer Science
College
College of Computer Studies
Department/Unit
Computer Science
Thesis Adviser
Jocelynn W. Cu
Defense Panel Member
Clement Y. Ong
Merlin Teodosia C. Suarez
Francis P. Lai
Abstract/Summary
Human-computer interaction is moving towards giving computers the ability to adapt and give feedback according to a user's emotion. Initial research on multimodal emotion recognition showed that combining vocal and facial signals performs better than using physiological signals. In addition, most emotion corpora used in both unimodal and multimodal systems were built from acted data, in which actors tend to exaggerate emotions. This study improves on the accuracy of single-modality systems by developing a multimodal emotion recognition system that uses vocal and facial expressions from a spontaneous emotion corpus. The corpus used in this study is FilMED2, which contains spontaneous clips from reality television shows. Each clip carries one of five discrete emotion labels: happiness, sadness, anger, fear, or neutral. The system extracts facial feature points and prosodic features, namely pitch and energy, which are then classified through machine learning. A support vector machine (SVM) was used for classification and was first tested on each modality with both the acted and the spontaneous corpora; for both modalities, the acted corpus yielded higher accuracy than the spontaneous corpus. The two modalities were then combined using decision-level fusion. The face alone gave 60% accuracy, while the voice alone gave 32%; combining both results with a weight distribution of 75% face and 25% voice gave an accuracy rate of 80%.
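The weighted decision-level fusion described above can be illustrated with a minimal sketch. This is not the thesis's actual implementation; the per-class scores and function names below are hypothetical, assuming each modality's classifier outputs a score per emotion class that can be combined with the stated 75%/25% weights:

```python
import numpy as np

# The five discrete emotion labels used in the FilMED2 corpus
EMOTIONS = ["happiness", "sadness", "anger", "fear", "neutral"]

def fuse_decisions(face_scores, voice_scores, w_face=0.75, w_voice=0.25):
    """Weighted decision-level fusion: blend per-class scores from the
    face and voice classifiers, then pick the highest-scoring emotion."""
    combined = w_face * np.asarray(face_scores) + w_voice * np.asarray(voice_scores)
    return EMOTIONS[int(np.argmax(combined))]

# Hypothetical per-class scores from each modality's SVM
face = [0.6, 0.1, 0.1, 0.1, 0.1]   # face classifier favors "happiness"
voice = [0.2, 0.4, 0.2, 0.1, 0.1]  # voice classifier favors "sadness"
print(fuse_decisions(face, voice))  # the heavier face weight decides the tie
```

With the 75/25 weighting, the face modality dominates unless the voice classifier disagrees very strongly, which matches the large accuracy gap (60% vs. 32%) between the two modalities reported above.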
Abstract Format
html
Language
English
Format
Accession Number
TU15561
Shelf Location
Archives, The Learning Commons, 12F, Henry Sy Sr. Hall
Physical Description
1 v. (various foliations) : ill. (some col.) ; 28 cm.
Keywords
Human-computer interaction; Pattern recognition systems; Computer vision; Artificial intelligence
Recommended Citation
Dy, M. C., Espinoza, I. L., Go, P. V., & Mendez, C. M. (2010). Multimodal emotion recognition system for spontaneous vocal and facial signals: SMERFS. Retrieved from https://animorepository.dlsu.edu.ph/etd_bachelors/14653