Model space size scaling for speaker adaptation
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology.
2011 (English) In: Proceedings of Fonetik 2011, Stockholm: KTH Royal Institute of Technology, 2011, Vol. 51, no 1, 77-80 p. Conference paper (Other academic)
Abstract [en]

In the current work, instantaneous adaptation in speech recognition is performed by estimating speaker properties, which modify the original trained acoustic models. We introduce a new property, the size of the model space, which is added to the previously used features, VTLN and spectral slope. These are jointly estimated for each test utterance. The new feature has been shown to be effective for recognition of children’s speech using adult-trained models in TIDIGITS. Adding the feature lowered the error rate by around 10% relative. The overall combination of VTLN, spectral slope and model space scaling represents a substantial 31% relative reduction compared with VTLN alone. There was no improvement among adult speakers in TIDIGITS and in TIMIT. Improvement for this speaker category is expected when the training and test sets are recorded in different conditions, such as read and spontaneous speech.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2011. Vol. 51, no 1, 77-80 p.
Trita-TMH, ISSN 1104-5787 ; 2011:1
National Category
Computer Science; Language Technology (Computational Linguistics)
URN: urn:nbn:se:kth:diva-52216
OAI: diva2:465512
Fonetik 2011, 8-10 June 2011, Stockholm
tmh_import_11_12_14. QC 20120112
Available from: 2011-12-14 Created: 2011-12-14 Last updated: 2012-01-12 Bibliographically approved

By author/editor
Blomberg, Mats
