Dynamic vocal tract length normalization in speech recognition
2010 (English). In: Proceedings from Fonetik 2010: Working Papers 54, Centre for Languages and Literature, Lund University, Sweden, 2010, 29-34 p. Conference paper (Other academic)
A novel method to account for dynamic speaker characteristic properties in a speech recognition system is presented. The estimated trajectory of a property can be constrained to be constant or to have a limited rate-of-change within a phone or a sub-phone state. The constraints are implemented by extending each state in the trained Hidden Markov Model by a number of property-value-specific sub-states transformed from the original model. The connections in the transition matrix of the extended model define possible slopes of the trajectory. Constraints on its dynamic range during an utterance are implemented by decomposing the trajectory into a static and a dynamic component. Results are presented on vocal tract length normalization in connected-digit recognition of children's speech using models trained on male adult speech. The word error rate was reduced compared with the conventional utterance-specific warping factor by 10% relative.
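As a rough illustration of how connections in an extended transition matrix can limit the slope of a property trajectory, the following minimal sketch expands each state of a small transition matrix into several warping-factor sub-states. It is not the implementation from the paper: the function name `extend_transitions`, the parameters `n_warp` and `max_step`, and the uniform split of probability mass across reachable sub-states are all assumptions made for the example, and the transformation of the emission models is not shown.

```python
def extend_transitions(A, n_warp, max_step=1):
    """Expand an N x N transition matrix to (N*n_warp) x (N*n_warp).

    Extended state index = state * n_warp + warp_index.
    A move from warp index k to k2 is allowed only if |k2 - k| <= max_step,
    so max_step bounds the per-frame slope of the warp trajectory.
    Hypothetical sketch: the uniform split over reachable warp indices
    is an assumption, not taken from the paper.
    """
    n_states = len(A)
    size = n_states * n_warp
    ext = [[0.0] * size for _ in range(size)]
    for s in range(n_states):
        for k in range(n_warp):
            # Warp indices reachable from k under the slope limit.
            reachable = [k2 for k2 in range(n_warp) if abs(k2 - k) <= max_step]
            share = 1.0 / len(reachable)
            for s2 in range(n_states):
                for k2 in reachable:
                    ext[s * n_warp + k][s2 * n_warp + k2] = A[s][s2] * share
    return ext


# Toy two-state left-to-right model (values chosen only for illustration).
A = [[0.6, 0.4],
     [0.0, 1.0]]
ext = extend_transitions(A, n_warp=3, max_step=1)
```

Setting `max_step=0` in this sketch keeps the warp index fixed along every path, which corresponds to a constant trajectory; larger values permit steeper slopes. Each row of the extended matrix still sums to the same total as the corresponding row of `A`.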
Place, publisher, year, edition, pages
Centre for Languages and Literature, Lund University, Sweden, 2010. 29-34 p.
Computer Science; Language Technology (Computational Linguistics)
Identifiers: URN: urn:nbn:se:kth:diva-52153; OAI: oai:DiVA.org:kth-52153; DiVA: diva2:465448
Fonetik 2010, Lund, June 2-4, 2010
QC 20120111. tmh_import_11_12_14. 2011-12-14; 2011-12-14; 2012-01-11. Bibliographically approved.