Projection of Turn Completion in Incremental Spoken Dialogue Systems
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-3513-4132
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-8579-1790
2021 (English). In: SIGDIAL 2021 - 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference, Virtual, Singapore, 29 July 2021 through 31 July 2021. Association for Computational Linguistics, 2021, p. 431-437.
Conference paper, Published paper (Refereed)
Abstract [en]

The ability to take turns in a fluent way (i.e., without long response delays or frequent interruptions) is a fundamental aspect of any spoken dialog system. However, practical speech recognition services typically induce a long response delay, as it takes time before the processing of the user's utterance is complete. There is a considerable amount of research indicating that humans achieve fast response times by projecting what the interlocutor will say and estimating upcoming turn completions. In this work, we implement this mechanism in an incremental spoken dialog system, by using a language model that generates possible futures to project upcoming completion points. In theory, this could make the system more responsive, while still having access to semantic information not yet processed by the speech recognizer. We conduct a small study which indicates that this is a viable approach for practical dialog systems, and that this is a promising direction for future research.
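
The projection idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy next-token table `TOY_LM`, the end-of-turn marker `EOT`, and the functions `sample_future` and `completion_probability` are all hypothetical names introduced here, and a real system would sample from a trained language model over incremental speech recognition output. The sketch only shows the general pattern, assuming that turn completion is estimated as the fraction of sampled futures that reach an end-of-turn marker within a few tokens.

```python
import random
from typing import Dict, List, Sequence

EOT = "<eot>"  # hypothetical end-of-turn marker (assumption, not from the paper)

# Toy next-token table standing in for a real language model (assumption).
TOY_LM: Dict[str, List[str]] = {
    "i": ["want"],
    "want": ["a", "to"],
    "a": ["ticket"],
    "to": ["go"],
    "go": ["home"],
    "ticket": [EOT],
    "home": [EOT],
}


def sample_future(prefix: Sequence[str], rng: random.Random, max_tokens: int) -> List[str]:
    """Sample one possible continuation of `prefix`, at most `max_tokens` long."""
    if not prefix:
        return []
    future: List[str] = []
    last = prefix[-1]
    for _ in range(max_tokens):
        options = TOY_LM.get(last)
        if not options:  # unknown word: the toy model cannot continue
            break
        last = rng.choice(options)
        future.append(last)
        if last == EOT:  # this sampled future ends the turn here
            break
    return future


def completion_probability(
    prefix: Sequence[str], horizon: int, n_futures: int = 100, seed: int = 0
) -> float:
    """Fraction of sampled futures that reach EOT within `horizon` tokens."""
    rng = random.Random(seed)
    hits = sum(
        EOT in sample_future(prefix, rng, max_tokens=horizon)
        for _ in range(n_futures)
    )
    return hits / n_futures


if __name__ == "__main__":
    # The words heard so far, as they might arrive incrementally.
    heard = ["i", "want", "a", "ticket"]
    for end in range(1, len(heard) + 1):
        partial = heard[:end]
        print(partial, completion_probability(partial, horizon=1))
```

In this sketch, a dialog system could start preparing its response once the estimated probability crosses a chosen threshold, rather than waiting for the recognizer to finalize the utterance. The threshold and the horizon are design parameters that the abstract does not specify.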

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2021. p. 431-437
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-304761
ISI: 000707001800045
Scopus ID: 2-s2.0-85136067428
OAI: oai:DiVA.org:kth-304761
DiVA, id: diva2:1610961
Conference
22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), July 29-31, 2021, Singapore
Projects
tmh_turntaking
Note

Part of proceedings: ISBN 978-1-954085-81-7, QC 20230117

Available from: 2021-11-12 Created: 2021-11-12 Last updated: 2024-10-24 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Ekstedt, Erik; Skantze, Gabriel
