Data-driven models for timing feedback responses in a Map Task dialogue system
2014 (English). In: Computer speech & language (Print), ISSN 0885-2308, E-ISSN 1095-8363, Vol. 28, no 4, 903-922 p. Article in journal (Refereed). Published.
Traditional dialogue systems use a fixed silence threshold to detect the end of users' turns. Such a simplistic model can result in system behaviour that is both interruptive and unresponsive, which in turn affects user experience. Various studies have observed that human interlocutors take cues from speaker behaviour, such as prosody, syntax, and gestures, to coordinate the smooth exchange of speaking turns. However, little effort has been made towards implementing these models in dialogue systems and verifying how well they model turn-taking behaviour in human-computer interactions. We present a data-driven approach to building models for online detection of suitable feedback response locations in the user's speech. We first collected human-computer interaction data using a spoken dialogue system that can perform the Map Task with users (albeit using a trick). On this data, we trained various models that use automatically extractable prosodic, contextual and lexico-syntactic features for detecting response locations. Next, we implemented a trained model in the same dialogue system and evaluated it in interactions with users. The subjective and objective measures from the user evaluation confirm that a model trained on speaker behavioural cues offers both smoother turn transitions and more responsive system behaviour.
Keywords: Spoken dialogue systems, Timing feedback, Turn-taking, User evaluation
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-147402
DOI: 10.1016/j.csl.2014.02.002
ISI: 000336694200005
ScopusID: 2-s2.0-84900533798
OAI: oai:DiVA.org:kth-147402
DiVA: diva2:731672
QC 20140702
2014-07-02, 2014-06-27, 2014-07-02. Bibliographically approved