"Am I listening?", Evaluating the Quality of Generated Data-driven Listening Motion
IDLab-AIRO, Ghent University, Ghent, Belgium.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-1643-1054
IDLab-AIRO, Ghent University, Ghent, Belgium.
2023 (English). In: ICMI 2023 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction, Association for Computing Machinery (ACM), 2023, p. 6-10. Conference paper, Published paper (Refereed).
Abstract [en]

This paper asks whether recent models for generating co-speech gesticulation may also learn to exhibit listening behaviour. We consider two models from recent gesture-generation challenges and train them on a dataset of audio and 3D motion capture from dyadic conversations. One model is driven by information from both sides of the conversation, whereas the other uses only the character's own speech. Several user studies assess the motion generated when the character is actively speaking versus when the character is the listener in the conversation. We find that participants can reliably discern motion associated with listening, whether from motion capture or generated by the models. Both models are thus able to produce distinctive listening behaviour, even though only one model is truly a listener, in the sense that it has access to information from the other party in the conversation. Additional experiments on both natural and model-generated motion find that motion associated with listening is rated as less human-like than motion associated with active speaking.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023. p. 6-10.
Keywords [en]
embodied conversational agents, listening behaviour
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:kth:diva-339688
DOI: 10.1145/3610661.3617160
Scopus ID: 2-s2.0-85175853253
OAI: oai:DiVA.org:kth-339688
DiVA, id: diva2:1812471
Conference
25th International Conference on Multimodal Interaction, ICMI 2023 Companion, Paris, France, October 9-13, 2023
Note

Part of ISBN 9798400703218

QC 20231116

Available from: 2023-11-16. Created: 2023-11-16. Last updated: 2025-02-07. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Henter, Gustav Eje
