Measuring final lengthening for speaker-change prediction
2011 (English). In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Florence, Italy, 2011, 2076-2079 p. Conference paper (Refereed)
We explore pre-silence syllabic lengthening as a cue for next-speakership prediction in spontaneous dialogue. When estimated using a transcription-mediated procedure, lengthening is shown to reduce error rates by 25% relative to majority class guessing. This indicates that lengthening should be exploited by dialogue systems. With that in mind, we evaluate an automatic measure of spectral envelope change, Mel-spectral flux (MSF), and show that its performance is at least as good as that of the transcription-mediated measure. Modeling MSF is likely to improve turn uptake in dialogue systems, and to benefit other applications needing an estimate of durational variability in speech.
Place, publisher, year, edition, pages
Florence, Italy, 2011. 2076-2079 p.
Keywords
End-of-turn prediction, Final lengthening, Rate of speech, Turn-taking
National Category
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52199
ISI: 000316502201008
ScopusID: 2-s2.0-84865782515
ISBN: 978-1-61839-270-1
OAI: oai:DiVA.org:kth-52199
DiVA: diva2:465497
12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011; Florence; Italy; 27 August 2011 through 31 August 2011
tmh_import_11_12_14. QC 20120119
2011-12-14; 2011-12-14; 2014-01-16. Bibliographically approved