A general-purpose 32 ms prosodic vector for Hidden Markov Modeling
2009 (English). In: Proceedings of Interspeech 2009, Brighton, UK: ISCA, 2009, 724-729 p. Conference paper (Refereed)
Prosody plays a central role in communicating via speech, making it important for speech technologies to model. Unfortunately, the application of standard modeling techniques to the acoustics of prosody has been hindered by difficulties in modeling intonation. In this work, we explore the suitability of the recently introduced fundamental frequency variation (FFV) spectrum as a candidate general representation of tone. Experiments on 4 tasks demonstrate that FFV features are complementary to other acoustic measures of prosody and that hidden Markov models offer a suitable modeling paradigm. Proposed improvements yield a 35% relative decrease in error on unseen data and simultaneously reduce time complexity by more than an order of magnitude. The resulting representation is sufficiently mature for general deployment in a broad range of automatic speech processing applications.
Place, publisher, year, edition, pages
Brighton, UK: ISCA, 2009. 724-729 p.
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52011
ScopusID: 2-s2.0-70450194699
OAI: oai:DiVA.org:kth-52011
DiVA: diva2:465304
Interspeech 2009, Brighton, UK