The GENEA Challenge 2023: A large-scale evaluation of gesture generation models in monadic and dyadic settings
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. SEED Elect Arts EA, Stockholm, Sweden. ORCID iD: 0000-0001-9838-8848
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-9653-6699
ETRI, Daejeon, South Korea.
Sorbonne Univ, ISIR, Paris, France.
2023 (English). In: Proceedings of the 25th International Conference on Multimodal Interaction, ICMI 2023, Association for Computing Machinery (ACM), 2023, pp. 792-801. Conference paper, Published paper (Refereed)
Abstract [en]

This paper reports on the GENEA Challenge 2023, in which participating teams built speech-driven gesture-generation systems using the same speech and motion dataset, followed by a joint evaluation. This year's challenge provided data on both sides of a dyadic interaction, allowing teams to generate full-body motion for an agent given its speech (text and audio) and the speech and motion of the interlocutor. We evaluated 12 submissions and 2 baselines together with held-out motion-capture data in several large-scale user studies. The studies focused on three aspects: 1) the human-likeness of the motion, 2) the appropriateness of the motion for the agent's own speech whilst controlling for the human-likeness of the motion, and 3) the appropriateness of the motion for the behaviour of the interlocutor in the interaction, using a setup that controls for both the human-likeness of the motion and the agent's own speech. We found a large span in human-likeness between challenge submissions, with a few systems rated close to human mocap. Appropriateness seems far from being solved, with most submissions performing in a narrow range slightly above chance, far behind natural motion. The effect of the interlocutor is even more subtle, with submitted systems at best performing barely above chance. Interestingly, a dyadic system being highly appropriate for agent speech does not necessarily imply high appropriateness for the interlocutor. Additional material is available via the project website at svito-zar.github.io/GENEAchallenge2023/.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023. pp. 792-801
Keywords [en]
gesture generation, embodied conversational agents, evaluation paradigms, dyadic interaction, interlocutor awareness
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-343599
DOI: 10.1145/3577190.3616120
ISI: 001147764700098
Scopus ID: 2-s2.0-85170511127
ISBN: 9798400700552 (print)
OAI: oai:DiVA.org:kth-343599
DiVA, id: diva2:1840355
Conference
25th International Conference on Multimodal Interaction (ICMI), OCT 09-13, 2023, Sorbonne Univ, Paris, France.
Note

Part of ISBN: 979-840070055-2

QC 20240223

Available from: 2024-02-23. Created: 2024-02-23. Last updated: 2024-02-26. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Person

Kucherenko, Taras; Nagy, Rajmund; Henter, Gustav Eje
