Predicting Unseen Articulations from Multi-speaker Articulatory Models
2010 (English). In: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, Makuhari, Japan, 2010, 1588-1591 p. Conference paper (Refereed)
In order to study inter-speaker variability, this work aims to assess the generalization capabilities of data-based multi-speaker articulatory models. We use various three-mode factor analysis techniques to model the variations of midsagittal vocal tract contours obtained from MRI images for three French speakers articulating 73 vowels and consonants. Articulations of a given speaker for phonemes not present in the training set are then predicted by inversion of the models from measurements of these phonemes articulated by the other subjects. On the average, the prediction RMSE was 5.25 mm for tongue contours, and 3.3 mm for 2D midsagittal vocal tract distances. Besides, this study has established a methodology to determine the optimal number of factors for such models.
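The prediction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: plain PCA (via SVD) stands in for the three-mode factor analysis, the array layout and the names predict_unseen and rmse are hypothetical, and the RMSE here is taken over contour coordinates, which may differ from the paper's exact definition. The model is fitted on all phonemes except one; the factor scores of the held-out phoneme are then estimated by least squares from the other speakers' measurements and used to reconstruct the target speaker's contour.

```python
import numpy as np


def predict_unseen(data, speaker, phoneme, n_factors):
    """Predict one speaker's contour for a phoneme left out of training.

    data: array of shape (n_speakers, n_phonemes, n_features), where the
    features are flattened contour coordinates (an assumed layout).
    """
    n_speakers, n_phonemes, n_features = data.shape

    # One row per phoneme; columns are all speakers' features, speaker-major.
    rows = data.transpose(1, 0, 2).reshape(n_phonemes, n_speakers * n_features)

    # Fit the linear model (PCA stand-in) without the held-out phoneme.
    train = np.delete(rows, phoneme, axis=0)
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    basis = vt[:n_factors]  # shape (n_factors, n_speakers * n_features)

    # Columns of the target speaker are treated as unknown.
    target = np.zeros(n_speakers * n_features, dtype=bool)
    target[speaker * n_features:(speaker + 1) * n_features] = True
    observed = ~target

    # Inversion: least-squares factor scores from the other speakers' data.
    scores, *_ = np.linalg.lstsq(
        basis[:, observed].T,
        rows[phoneme, observed] - mean[observed],
        rcond=None,
    )

    # Reconstruct the unknown columns from the estimated scores.
    return mean[target] + scores @ basis[:, target]


def rmse(predicted, measured):
    """Root-mean-square error over all coordinates."""
    diff = np.asarray(predicted) - np.asarray(measured)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Repeating the call for increasing values of n_factors and comparing the held-out RMSE is one simple way to choose the number of factors; the paper's own selection procedure is not detailed in this record.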
Place, publisher, year, edition, pages
Makuhari, Japan, 2010. 1588-1591 p.
Keywords
Factor analysis, Multi-speaker articulatory model
National Category
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52154; ISI: 000313086500009; ScopusID: 2-s2.0-79959825917; ISBN: 978-1-61782-123-3; OAI: oai:DiVA.org:kth-52154; DiVA: diva2:465449
Conference
11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010; Makuhari, Chiba; 26 September 2010 through 30 September 2010