kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Evaluation of generative models for emotional 3D animation generation in VR
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).ORCID iD: 0000-0002-7414-845X
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).ORCID iD: 0000-0003-1206-5701
KTH, School of Electrical Engineering and Computer Science (EECS), Human Centered Technology, Media Technology and Interaction Design, MID.ORCID iD: 0000-0002-6571-0623
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).ORCID iD: 0000-0002-7257-0761
2025 (English)In: Frontiers in Computer Science, E-ISSN 2624-9898, Vol. 7, article id 1598099Article in journal (Refereed) Published
Abstract [en]

Introduction: Social interactions incorporate various nonverbal signals to convey emotions alongside speech, including facial expressions and body gestures. Generative models have demonstrated promising results in creating full-body nonverbal animations synchronized with speech; however, evaluations using statistical metrics in 2D settings fail to fully capture user-perceived emotions, limiting our understanding of the effectiveness of these models. Methods: To address this, we evaluate emotional 3D animation generative models within an immersive Virtual Reality (VR) environment, emphasizing user—centric metrics-emotional arousal realism, naturalness, enjoyment, diversity, and interaction quality—in a real-time human-agent interaction scenario. Through a user study (N = 48), we systematically examine perceived emotional quality for three state-of-the-art speech-driven 3D animation methods across two specific emotions: happiness (high arousal) and neutral (mid arousal). Additionally, we compare these generative models against real human expressions obtained via a reconstruction-based method to assess both their strengths and limitations and how closely they replicate real human facial and body expressions. Results: Our results demonstrate that methods explicitly modeling emotions lead to higher recognition accuracy compared to those focusing solely on speech-driven synchrony. Users rated the realism and naturalness of happy animations significantly higher than those of neutral animations, highlighting the limitations of current generative models in handling subtle emotional states. Discussion: Generative models underperformed compared to reconstruction-based methods in facial expression quality, and all methods received relatively low ratings for animation enjoyment and interaction quality, emphasizing the importance of incorporating user-centric evaluations into generative model development. Finally, participants positively recognized animation diversity across all generative models.

Place, publisher, year, edition, pages
Frontiers Media SA , 2025. Vol. 7, article id 1598099
Keywords [en]
3D emotional animation, generative models, nonverbal communication, user-centric evaluation, virtual reality
National Category
Human Computer Interaction Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-369923DOI: 10.3389/fcomp.2025.1598099ISI: 001549678200001Scopus ID: 2-s2.0-105013367950OAI: oai:DiVA.org:kth-369923DiVA, id: diva2:1999042
Note

QC 20250918

Available from: 2025-09-18 Created: 2025-09-18 Last updated: 2025-09-18Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full textScopus

Authority records

Chhatre, KiranGuarese, RenanMatviienko, AndriiPeters, Christopher

Search in DiVA

By author/editor
Chhatre, KiranGuarese, RenanMatviienko, AndriiPeters, Christopher
By organisation
Computational Science and Technology (CST)Media Technology and Interaction Design, MID
In the same journal
Frontiers in Computer Science
Human Computer InteractionComputer Sciences

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 81 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf