Representation of perceived prosodic similarity of conversational feedback
Qian, Livia: KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-7885-5477
Figueroa, Carol: KTH.
Skantze, Gabriel: KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-8579-1790
2025 (English). In: Interspeech 2025, International Speech Communication Association, 2025, p. 374-378. Conference paper, Published paper (Refereed)
Abstract [en]

Vocal feedback (e.g., 'mhm', 'yeah', 'okay') is an important component of spoken dialogue and is crucial to ensuring common ground in conversational systems. The exact meaning of such feedback is conveyed through both lexical and prosodic form. In this work, we investigate the perceived prosodic similarity of vocal feedback with the same lexical form, and to what extent existing speech representations reflect such similarities. A triadic comparison task with recruited participants is used to measure perceived similarity of feedback responses taken from two different datasets. We find that spectral and self-supervised speech representations encode prosody better than extracted pitch features, especially in the case of feedback from the same speaker. We also find that it is possible to further condense and align the representations to human perception through contrastive learning.
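
As a rough illustration of how a speech representation could be compared against triadic similarity judgments, the sketch below scores a set of embeddings by how often their distances agree with human choices, and computes a triplet margin loss of the kind often used in contrastive learning. It is a minimal sketch, not the authors' method: the (anchor, more similar, less similar) encoding of a judgment, the cosine distance, the margin value, and the function names `triplet_agreement` and `triplet_margin_loss` are all assumptions made for this example, and the embeddings are toy values.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance between two 1-D vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def triplet_agreement(embeddings, triplets):
    """Fraction of human triplets that the embedding distances agree with.

    Each triplet is (anchor, positive, negative): listeners judged the
    positive item to sound more similar to the anchor than the negative
    item. This encoding is an assumption of the sketch.
    """
    hits = 0
    for anchor, pos, neg in triplets:
        d_pos = cosine_distance(embeddings[anchor], embeddings[pos])
        d_neg = cosine_distance(embeddings[anchor], embeddings[neg])
        hits += d_pos < d_neg
    return hits / len(triplets)


def triplet_margin_loss(embeddings, triplets, margin=0.2):
    """Mean hinge loss over triplets; zero when every positive is closer
    to its anchor than the negative by at least `margin`."""
    losses = []
    for anchor, pos, neg in triplets:
        d_pos = cosine_distance(embeddings[anchor], embeddings[pos])
        d_neg = cosine_distance(embeddings[anchor], embeddings[neg])
        losses.append(max(0.0, d_pos - d_neg + margin))
    return float(np.mean(losses))


# Toy data: four feedback tokens with 2-D embeddings (hypothetical values).
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
# The third triplet deliberately contradicts the embedding geometry.
toy_triplets = [(0, 1, 2), (2, 3, 0), (0, 2, 1)]

agreement = triplet_agreement(toy_embeddings, toy_triplets)
loss = triplet_margin_loss(toy_embeddings, toy_triplets)
print(f"agreement={agreement:.3f}, loss={loss:.3f}")
```

In a setup like this, the agreement score could be used to compare representations (for example pitch features against self-supervised embeddings), while the loss could serve as a training objective for a small projection that aligns a representation with the human judgments.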

Place, publisher, year, edition, pages
International Speech Communication Association, 2025. p. 374-378
Keywords [en]
dialogue systems, feedback, feedback analysis, human-computer interaction, prosodic similarity, prosody
National Category
Comparative Language Studies and Linguistics; Natural Language Processing; Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-372787
DOI: 10.21437/Interspeech.2025-1771
Scopus ID: 2-s2.0-105020035147
OAI: oai:DiVA.org:kth-372787
DiVA, id: diva2:2015308
Conference
26th Interspeech Conference 2025, Rotterdam, Kingdom of the Netherlands, August 17-21, 2025
Note

QC 20251120

Available from: 2025-11-20. Created: 2025-11-20. Last updated: 2025-11-20. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Qian, Livia; Skantze, Gabriel

Search in DiVA

By author/editor
Qian, Livia; Figueroa, Carol; Skantze, Gabriel