Comparing supervised and unsupervised approaches to multimodal emotion recognition
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. Karolinska Institutet, Department of Learning, Informatics, Management and Ethics (LIME), Stockholm, Sweden. ORCID iD: 0000-0001-7949-1815
Stockholm University, Department of Psychology, Stockholm, Sweden.
2021 (English). In: PeerJ Computer Science, E-ISSN 2376-5992, Vol. 7, article id e804. Article in journal (Refereed). Published.
Abstract [en]

We investigated emotion classification from brief video recordings from the GEMEP database wherein actors portrayed 18 emotions. Vocal features consisted of acoustic parameters related to frequency, intensity, spectral distribution, and durations. Facial features consisted of facial action units. We first performed a series of person-independent supervised classification experiments. Best performance (AUC = 0.88) was obtained by merging the output from the best unimodal vocal (Elastic Net, AUC = 0.82) and facial (Random Forest, AUC = 0.80) classifiers using a late fusion approach and the product rule method. All 18 emotions were recognized with above-chance recall, although recognition rates varied widely across emotions (e.g., high for amusement, anger, and disgust; low for shame). Multimodal feature patterns for each emotion are described in terms of the vocal and facial features that contributed most to classifier performance. Next, a series of exploratory unsupervised classification experiments were performed to gain more insight into how emotion expressions are organized. Solutions from traditional clustering techniques were interpreted using decision trees to explore which features underlie clustering. Another approach utilized various dimensionality reduction techniques paired with inspection of data visualizations. Unsupervised methods did not cluster stimuli in terms of emotion categories, but several explanatory patterns were observed. Some could be interpreted in terms of valence and arousal, but actor- and gender-specific aspects also contributed to clustering. Identifying explanatory patterns holds great potential as a meta-heuristic when unsupervised methods are used in complex classification tasks.
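The headline supervised result merges the two unimodal classifiers through late fusion with the product rule. The sketch below illustrates that fusion step only; it is a minimal, hypothetical example on synthetic data (the GEMEP features and the paper's exact models are not reproduced), using an elastic-net-penalised logistic regression as a stand-in for the vocal Elastic Net classifier and a random forest for the facial classifier.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: one label vector, with the feature columns split
# into a "vocal" block and a "facial" block (hypothetical split; the study
# used acoustic parameters and facial action units, respectively).
X, y = make_classification(n_samples=300, n_features=105, n_informative=30,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_vocal, X_facial = X[:, :88], X[:, 88:]

idx_train, idx_test = train_test_split(np.arange(len(y)), test_size=0.3,
                                       stratify=y, random_state=0)

# Unimodal models: an elastic-net-penalised logistic regression for the vocal
# block and a random forest for the facial block.
vocal_clf = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga",
                       l1_ratio=0.5, C=1.0, max_iter=5000),
)
facial_clf = RandomForestClassifier(n_estimators=500, random_state=0)

vocal_clf.fit(X_vocal[idx_train], y[idx_train])
facial_clf.fit(X_facial[idx_train], y[idx_train])

# Late fusion with the product rule: multiply the per-class posteriors from
# the two unimodal classifiers, then renormalise so each row sums to one.
p_vocal = vocal_clf.predict_proba(X_vocal[idx_test])
p_facial = facial_clf.predict_proba(X_facial[idx_test])
p_fused = p_vocal * p_facial
p_fused /= p_fused.sum(axis=1, keepdims=True)

y_pred = p_fused.argmax(axis=1)
print("Fused accuracy on synthetic data:", (y_pred == y[idx_test]).mean())
```

On the actual GEMEP data, the authors report that this late fusion (AUC = 0.88) outperformed either unimodal classifier alone.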

Place, publisher, year, edition, pages
PeerJ, 2021. Vol. 7, article id e804
Keywords [en]
Affective computing, Facial expression, Multimodal emotion recognition, Supervised and unsupervised learning, Vocal expression
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-307344
DOI: 10.7717/peerj-cs.804
ISI: 000737129900001
PubMedID: 35036530
Scopus ID: 2-s2.0-85124043799
OAI: oai:DiVA.org:kth-307344
DiVA, id: diva2:1632178
Note

QC 20220126

Available from: 2022-01-26. Created: 2022-01-26. Last updated: 2022-06-25. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text | PubMed | Scopus

Authority records

Carbonell, Marcos Fernandez; Boman, Magnus

Search in DiVA

By author/editor
Carbonell, Marcos Fernandez; Boman, Magnus
By organisation
Software and Computer systems, SCS
In the same journal
PeerJ Computer Science
Computer Sciences

Search outside of DiVA

Google | Google Scholar
