Mhm... Yeah? Okay! Evaluating the Naturalness and Communicative Function of Synthesized Feedback Responses in Spoken Dialogue
Aix-Marseille University, Marseille, France; Furhat Robotics.
Constructor Technology.
Aix-Marseille University, Marseille, France.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. Furhat Robotics. ORCID iD: 0000-0002-8579-1790
2024 (English) In: Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue / [ed] Tatsuya Kawahara, Vera Demberg, Stefan Ultes, Koji Inoue, Shikib Mehri, David Howcroft, Kazunori Komatani, Association for Computational Linguistics (ACL), 2024, p. 544-553
Conference paper, Published paper (Refereed)
Abstract [en]

To create conversational systems with human-like listener behavior, generating short feedback responses (e.g., “mhm”, “ah”, “wow”) appropriate for their context is crucial. These responses convey their communicative function through their lexical form and their prosodic realization. In this paper, we transplant the prosody of feedback responses from human-human U.S. English telephone conversations to a target speaker using two synthesis techniques (TTS and signal processing). Our evaluation focuses on perceived naturalness, contextual appropriateness and preservation of communicative function. Results indicate that TTS-generated feedback was perceived as more natural than signal-processing-based feedback, with no significant difference in appropriateness. However, the TTS did not consistently convey the communicative function of the original feedback.

Place, publisher, year, edition, pages
Association for Computational Linguistics (ACL), 2024. p. 544-553
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:kth:diva-359145
DOI: 10.18653/v1/2024.sigdial-1.46
OAI: oai:DiVA.org:kth-359145
DiVA, id: diva2:1931341
Conference
25th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2024)
Note

QC 20250203

Available from: 2025-01-27 Created: 2025-01-27 Last updated: 2025-02-07 Bibliographically approved

Authority records

Skantze, Gabriel