The GENEA Challenge 2023: A large-scale evaluation of gesture generation models in monadic and dyadic settings
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Robotics, Perception and Learning, RPL; SEED Elect Arts EA, Stockholm, Sweden. ORCID iD: 0000-0001-9838-8848
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-9653-6699
ETRI, Daejeon, South Korea.
Sorbonne Univ, ISIR, Paris, France.
2023 (English). In: Proceedings of the 25th International Conference on Multimodal Interaction, ICMI 2023, Association for Computing Machinery (ACM), 2023, p. 792-801. Conference paper, published paper (refereed)
Abstract [en]

This paper reports on the GENEA Challenge 2023, in which participating teams built speech-driven gesture-generation systems using the same speech and motion dataset, followed by a joint evaluation. This year's challenge provided data on both sides of a dyadic interaction, allowing teams to generate full-body motion for an agent given its speech (text and audio) and the speech and motion of the interlocutor. We evaluated 12 submissions and 2 baselines together with held-out motion-capture data in several large-scale user studies. The studies focused on three aspects: 1) the human-likeness of the motion, 2) the appropriateness of the motion for the agent's own speech whilst controlling for the human-likeness of the motion, and 3) the appropriateness of the motion for the behaviour of the interlocutor in the interaction, using a setup that controls for both the human-likeness of the motion and the agent's own speech. We found a large span in human-likeness between challenge submissions, with a few systems rated close to human mocap. Appropriateness seems far from being solved, with most submissions performing in a narrow range slightly above chance, far behind natural motion. The effect of the interlocutor is even more subtle, with submitted systems at best performing barely above chance. Interestingly, a dyadic system being highly appropriate for agent speech does not necessarily imply high appropriateness for the interlocutor. Additional material is available via the project website at svito-zar.github.io/GENEAchallenge2023/.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023. p. 792-801
Keywords [en]
gesture generation, embodied conversational agents, evaluation paradigms, dyadic interaction, interlocutor awareness
Identifiers
URN: urn:nbn:se:kth:diva-343599
DOI: 10.1145/3577190.3616120
ISI: 001147764700098
Scopus ID: 2-s2.0-85170511127
ISBN: 9798400700552 (print)
OAI: oai:DiVA.org:kth-343599
DiVA, id: diva2:1840355
Conference
25th International Conference on Multimodal Interaction (ICMI), October 9-13, 2023, Sorbonne Univ, Paris, France.
Note

Part of ISBN: 979-840070055-2

QC 20240223

Available from: 2024-02-23. Created: 2024-02-23. Last updated: 2024-02-26. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

People

Kucherenko, Taras; Nagy, Rajmund; Henter, Gustav Eje