Exploring the Predictability of Non-Unique Acoustic-to-Articulatory Mappings
2012 (English). In: IEEE Transactions on Audio, Speech, and Language Processing, ISSN 1558-7916, Vol. 20, no. 10, p. 2672-2682. Article in journal (Refereed), Published.
This paper explores statistical tools that help analyze the predictability in the acoustic-to-articulatory inversion of speech, using an Electromagnetic Articulography database of simultaneously recorded acoustic and articulatory data. Since it has been shown that speech acoustics can be mapped to non-unique articulatory modes, the variance of the articulatory parameters is not sufficient to understand the predictability of the inverse mapping. We, therefore, estimate an upper bound to the conditional entropy of the articulatory distribution. This provides a probabilistic estimate of the range of articulatory values (either over a continuum or over discrete non-unique regions) for a given acoustic vector in the database. The analysis is performed for different British/Scottish English consonants with respect to which articulators (lips, jaws or the tongue) are important for producing the phoneme. The paper shows that acoustic-articulatory mappings for the important articulators have a low upper bound on the entropy, but can still have discrete non-unique configurations.
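The abstract does not state which bound the authors use, so the following is only a minimal sketch of one standard upper bound on the entropy of a Gaussian mixture: H(X) <= H(Z) + sum_i w_i H(N_i), where Z is the mixture component label. The function name `gmm_entropy_upper_bound` and the toy two-mode mixture are assumptions made for illustration, not taken from the paper. The example shows why variance alone can mislead: two narrow, well-separated modes have a large overall variance but a small entropy bound.

```python
import numpy as np

def gmm_entropy_upper_bound(weights, covariances):
    """Upper bound (in nats) on the differential entropy of a Gaussian mixture.

    Uses H(X) <= H(Z) + sum_i w_i * H(N_i), with Z the component label.
    The component means do not enter this bound.
    """
    w = np.asarray(weights, dtype=float)
    covs = np.asarray(covariances, dtype=float)  # shape (K, d, d)
    if covs.ndim != 3 or covs.shape[0] != w.shape[0]:
        raise ValueError("covariances must have shape (K, d, d)")
    d = covs.shape[-1]
    # Entropy of each Gaussian component: 0.5 * ln((2*pi*e)^d * |Sigma_i|)
    _, logdets = np.linalg.slogdet(covs)
    comp_entropy = 0.5 * (d * np.log(2 * np.pi * np.e) + logdets)
    # Entropy of the discrete component label, skipping zero weights
    nz = w > 0
    label_entropy = -np.sum(w[nz] * np.log(w[nz]))
    return label_entropy + np.sum(w * comp_entropy)

# Hypothetical 1-D articulatory distribution with two narrow, distant modes.
weights = np.array([0.5, 0.5])
means = np.array([-5.0, 5.0])
covs = np.array([[[0.01]], [[0.01]]])

bound = gmm_entropy_upper_bound(weights, covs)

# Entropy of a single Gaussian with the same overall variance, for comparison.
total_var = np.sum(weights * (covs[:, 0, 0] + means**2)) - np.sum(weights * means) ** 2
single_gaussian_entropy = 0.5 * np.log(2 * np.pi * np.e * total_var)

print(f"mixture entropy bound:      {bound:.3f} nats")
print(f"variance-matched Gaussian:  {single_gaussian_entropy:.3f} nats")
```

For a conditional analysis of the kind the abstract describes, the weights and covariances passed in would be those of the articulatory distribution conditioned on a given acoustic vector; how the paper obtains them is not specified here.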
Keywords: Acoustic-to-articulatory inversion, entropy of GMM (Gaussian mixture model), many-to-one mapping
Language Technology (Computational Linguistics)
Identifiers: URN: urn:nbn:se:kth:diva-104992; DOI: 10.1109/TASL.2012.2210876; ISI: 000309600500005; ScopusID: 2-s2.0-84867169172; OAI: oai:DiVA.org:kth-104992; DiVA: diva2:570068
Funder: Swedish Research Council, 80449001
QC 20121116
2012-11-16, 2012-11-15, 2012-11-16. Bibliographically approved.