kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Predicting Unseen Articulations from Multi-speaker Articulatory Models
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT.
GIPSA-Lab, Grenoble University.
GIPSA-Lab, Grenoble University.
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT.ORCID iD: 0000-0003-4532-014X
2010 (English)In: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, Makuhari, Japan, 2010, p. 1588-1591Conference paper, Published paper (Refereed)
Abstract [en]

In order to study inter-speaker variability, this work aims to assessthe generalization capabilities of data-based multi-speakerarticulatory models. We use various three-mode factor analysistechniques to model the variations of midsagittal vocal tractcontours obtained from MRI images for three French speakersarticulating 73 vowels and consonants. Articulations of agiven speaker for phonemes not present in the training set arethen predicted by inversion of the models from measurementsof these phonemes articulated by the other subjects. On the average,the prediction RMSE was 5.25 mm for tongue contours,and 3.3 mm for 2D midsagittal vocal tract distances. Besides,this study has established a methodology to determine the optimalnumber of factors for such models.

Place, publisher, year, edition, pages
Makuhari, Japan, 2010. p. 1588-1591
Keywords [en]
Factor analysis, Multi-speaker articulatory model
National Category
Computer Sciences Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52154ISI: 000313086500009Scopus ID: 2-s2.0-79959825917ISBN: 978-1-61782-123-3 (print)OAI: oai:DiVA.org:kth-52154DiVA, id: diva2:465449
Conference
11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010; Makuhari, Chiba; 26 September 2010 through 30 September 2010
Note

tmh_import_11_12_14. QC 20111220

Available from: 2011-12-14 Created: 2011-12-14 Last updated: 2024-03-18Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Scopushttp://www.speech.kth.se/prod/publications/files/3466.pdf

Authority records

Ananthakrishnan, GopalEngwall, Olov

Search in DiVA

By author/editor
Ananthakrishnan, GopalEngwall, Olov
By organisation
Speech Communication and TechnologyCentre for Speech Technology, CTT
Computer SciencesLanguage Technology (Computational Linguistics)

Search outside of DiVA

GoogleGoogle Scholar

isbn
urn-nbn

Altmetric score

isbn
urn-nbn
Total: 651 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf