Computing the fundamental frequency variation spectrum in conversational spoken dialogue systems
2008 (English)
In: Proceedings of Acoustics'08, Paris, France, 2008, 3305-3310 p.
Conference paper (Refereed)
Continuous modeling of intonation in natural speech has long been hampered by a focus on modeling fundamental frequency, of which several normative aspects are particularly problematic. The latter include, among others, the fact that pitch is undefined in unvoiced segments, that its absolute magnitude is speaker-specific, and that its robust estimation and modeling, at a particular point in time, rely on a patchwork of long-time stability heuristics. In the present work, we continue our analysis of the fundamental frequency variation (FFV) spectrum, a recently proposed instantaneous, continuous, vector-valued representation of pitch variation, which is obtained by comparing the harmonic structure of the frequency magnitude spectra of the left and right halves of an analysis frame. We analyze the sensitivity of a task-specific error rate in a conversational spoken dialogue system to the specific definition of the left and right halves of a frame, resulting in operational recommendations regarding the framing policy and window shape.
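The comparison of left-half and right-half magnitude spectra described in the abstract can be illustrated with a rough sketch. This is not the authors' formulation: the function names `ffv_sketch` and `harmonic_tone`, the Hann window, the linear interpolation, the cosine-similarity measure, and the range of dilation exponents are all assumptions chosen for illustration. The idea is that if pitch changes by a factor of 2^rho between the two halves, stretching one spectrum and compressing the other by 2^(rho/2) should bring their harmonics into alignment.

```python
import numpy as np

def ffv_sketch(frame, rhos=None):
    """Illustrative FFV-like curve: similarity of the left and right
    half-frame magnitude spectra under a range of frequency dilations.
    rho is the hypothesised pitch change in octaves between the halves."""
    if rhos is None:
        rhos = np.linspace(-0.5, 0.5, 21)  # assumed search range
    half = len(frame) // 2
    window = np.hanning(half)  # assumed window shape
    left = np.abs(np.fft.rfft(frame[:half] * window))
    right = np.abs(np.fft.rfft(frame[half:2 * half] * window))
    bins = np.arange(len(left), dtype=float)
    similarity = np.empty(len(rhos))
    for i, rho in enumerate(rhos):
        # Stretch the left spectrum and compress the right one (or the
        # reverse for negative rho) so that matching harmonics coincide.
        l = np.interp(bins * 2.0 ** (-rho / 2), bins, left, right=0.0)
        r = np.interp(bins * 2.0 ** (rho / 2), bins, right, right=0.0)
        denom = np.linalg.norm(l) * np.linalg.norm(r) + 1e-12
        similarity[i] = np.dot(l, r) / denom
    return rhos, similarity

def harmonic_tone(f0, n, sr=16000, harmonics=5):
    """Hypothetical test signal: sum of harmonics with 1/h amplitudes."""
    t = np.arange(n) / sr
    return sum(np.sin(2 * np.pi * h * f0 * t) / h
               for h in range(1, harmonics + 1))
```

Under these assumptions, a frame whose two halves share the same pitch should yield a curve peaking at rho = 0, while rising or falling pitch should move the peak toward positive or negative rho. Note that no pitch estimate is ever computed; the whole curve serves as the representation.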
Place, publisher, year, edition, pages
Paris, France, 2008. 3305-3310 p.
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52010
ScopusID: 2-s2.0-84874840188
OAI: oai:DiVA.org:kth-52010
DiVA: diva2:465303
Acoustics'08, June 29-July 4, 2008. Paris