Estimating speaker characteristics for speech recognition
2009 (English). In: Proceedings of Fonetik 2009 / [ed] Peter Branderud, Hartmut Traunmüller, Stockholm: Stockholm University, 2009, 154-158 p. Conference paper (Other academic)
A hierarchical tree of speech recognition models, organized by speaker characteristics, is designed. The leaves of the tree contain model sets, which are created by transforming a conventionally trained set using leaf-specific speaker profile vectors. The non-leaf models are formed by merging the models of their child nodes. During recognition, a maximum likelihood criterion is followed to traverse the tree from the root to a leaf. The computational load for estimating one-dimensional (vocal tract length) and four-dimensional (vocal tract length, two spectral slope parameters, and model variance scaling) speaker profile vectors is reduced to a fraction of that required by an exhaustive search among all leaf nodes. Recognition experiments on children’s connected digits using adult models exhibit similar recognition performance for the exhaustive and the one-dimensional tree search. Further error reduction is achieved with the four-dimensional tree. The estimated speaker properties are analyzed and discussed.
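The root-to-leaf traversal described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Node`, `build_tree`, `tree_search`, and `exhaustive_search` are hypothetical, the speaker profile is reduced to a single number (standing in for something like a vocal tract length warp factor), a non-leaf node is represented simply by the mean of its children's profiles rather than by merged acoustic models, and the `score` callable is a toy stand-in for the likelihood of an utterance under a node's model set.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """Tree node; `profile` is a hypothetical 1-D speaker profile value."""
    profile: float
    children: List["Node"] = field(default_factory=list)


def build_tree(profiles: List[float], branching: int = 2) -> Node:
    """Build a tree bottom-up over the sorted leaf profiles.

    Assumption: a parent stands in for the merged models of its children
    and is represented here by the mean of their profile values.
    """
    nodes = [Node(p) for p in sorted(profiles)]
    while len(nodes) > 1:
        groups = [nodes[i:i + branching] for i in range(0, len(nodes), branching)]
        nodes = [
            g[0] if len(g) == 1 else Node(sum(n.profile for n in g) / len(g), g)
            for g in groups
        ]
    return nodes[0]


def tree_search(root: Node, score: Callable[[Node], float]) -> Tuple[Node, int]:
    """Greedy descent: at each level, move to the best-scoring child."""
    node, evaluations = root, 0
    while node.children:
        scored = [(score(child), child) for child in node.children]
        evaluations += len(scored)
        node = max(scored, key=lambda pair: pair[0])[1]
    return node, evaluations


def exhaustive_search(root: Node, score: Callable[[Node], float]) -> Tuple[Node, int]:
    """Baseline: score every leaf and return the best one."""
    leaves, stack = [], [root]
    while stack:
        current = stack.pop()
        if current.children:
            stack.extend(current.children)
        else:
            leaves.append(current)
    return max(leaves, key=score), len(leaves)


if __name__ == "__main__":
    # 16 hypothetical leaf profiles, evenly spaced from 0.80 to 1.10.
    root = build_tree([0.80 + 0.02 * i for i in range(16)])
    # Toy log-likelihood: peaks when the node profile matches a hidden value.
    toy_score = lambda n: -(n.profile - 0.935) ** 2
    print(tree_search(root, toy_score))
    print(exhaustive_search(root, toy_score))
```

In this toy setting with 16 leaves and binary branching, the greedy descent scores 8 nodes while the exhaustive search scores all 16, and both arrive at the same leaf. Greedy descent is not guaranteed to match the exhaustive result in general; the abstract reports similar recognition performance for the two in the one-dimensional case.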
Place, publisher, year, edition, pages
Stockholm: Stockholm University, 2009. 154-158 p.
Computer Science
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52101
ISBN: 978-91-633-4892-1
ISBN: 978-91-633-4893-8
OAI: oai:DiVA.org:kth-52101
DiVA: diva2:465396
Fonetik 2009, June 10-12, 2009, Stockholm