Model space size scaling for speaker adaptation
2011 (English) In: Proceedings of Fonetik 2011, Stockholm: KTH Royal Institute of Technology, 2011, Vol. 51, no 1, 77-80 p. Conference paper (Other academic)
In the current work, instantaneous adaptation in speech recognition is performed by estimating speaker properties, which are used to modify the originally trained acoustic models. We introduce a new property, the size of the model space, which is added to the previously used features, VTLN and spectral slope. These are jointly estimated for each test utterance. The new feature proved effective for recognition of children’s speech using adult-trained models in TIDIGITS. Adding the feature lowered the error rate by around 10% relative. The overall combination of VTLN, spectral slope and model space scaling represents a substantial 31% relative reduction compared with VTLN alone. There was no improvement among adult speakers in TIDIGITS or in TIMIT. Improvement for this speaker category is expected when the training and test sets are recorded in different conditions, such as read and spontaneous speech.
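As a rough illustration of per-utterance estimation of a model space size, the sketch below picks a scaling factor by grid search. It rests on assumptions that are not in the abstract: that scaling moves the model means toward or away from their centroid, that each model is a single Gaussian with a shared spherical variance, and that the factor is chosen by maximum likelihood. The function names `scale_model_space`, `utterance_log_likelihood` and `estimate_scale` are hypothetical, and VTLN and spectral slope, which the paper estimates jointly with the scaling, are left out.

```python
import math


def scale_model_space(means, factor):
    """Scale model means about their centroid (assumed form of the scaling)."""
    dim = len(means[0])
    centroid = [sum(m[d] for m in means) / len(means) for d in range(dim)]
    return [[c + factor * (x - c) for x, c in zip(m, centroid)] for m in means]


def utterance_log_likelihood(frames, means, variance=1.0):
    """Sum over frames of the log-likelihood under the best-matching Gaussian.

    Assumes one spherical Gaussian per model with a shared variance; this is
    a simplification, not the acoustic model used in the paper.
    """
    dim = len(frames[0])
    const = -0.5 * dim * math.log(2.0 * math.pi * variance)
    total = 0.0
    for frame in frames:
        best = max(
            -0.5 * sum((x - mu) ** 2 for x, mu in zip(frame, m)) / variance
            for m in means
        )
        total += const + best
    return total


def estimate_scale(frames, means, candidates):
    """Return the candidate factor that maximises the utterance likelihood."""
    return max(
        candidates,
        key=lambda f: utterance_log_likelihood(frames, scale_model_space(means, f)),
    )


# Toy data (invented for illustration): four 2-D model means and an
# "utterance" whose frames lie in a compressed version of that space.
toy_means = [[-1.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
toy_frames = scale_model_space(toy_means, 0.8)
best_factor = estimate_scale(toy_frames, toy_means, [0.6, 0.8, 1.0, 1.2])
```

In a joint estimation, the same search would run over combinations of warping factor, spectral slope and scaling factor rather than over the scaling factor alone.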
Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2011. Vol. 51, no 1, 77-80 p.
Trita-TMH, ISSN 1104-5787 ; 2011:1
Computer Science Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52216
OAI: oai:DiVA.org:kth-52216
DiVA: diva2:465512
Fonetik 2011, 8-10 June 2011, Stockholm