Reconstruction of vocal tract geometries from biomechanical simulations
KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-8991-1016
GTM Grup de recerca en Tecnologies Mèdia, La Salle, Universitat Ramon Llull, Barcelona, Spain.
KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-4532-014X
GTM Grup de recerca en Tecnologies Mèdia, La Salle, Universitat Ramon Llull, Barcelona, Spain.
2018 (English). In: International Journal for Numerical Methods in Biomedical Engineering, ISSN 2040-7939, E-ISSN 2040-7947. Article in journal (Refereed), Published
Abstract [en]

Medical imaging techniques are usually used to acquire the vocal tract geometry in 3D, which may then be used, e.g., for acoustic/fluid simulation. As an alternative, such a geometry may also be acquired from a biomechanical simulation, which allows the anatomy and/or articulation to be altered in order to study a variety of configurations. In a biomechanical model, each physical structure is described by its geometry and its properties (such as mass, stiffness, and muscles). In such a model, the vocal tract itself does not have an explicit representation, since it is a cavity rather than a physical structure. Instead, its geometry is defined implicitly by all the structures surrounding the cavity, and such an implicit representation may not be suitable for visualization or for acoustic/fluid simulation. In this work, we propose a method to reconstruct the vocal tract geometry at each time step during the biomechanical simulation. The proposed method addresses the complexity of the problem, which arises from model alignment artifacts. In addition to the main cavity, other small cavities, including the piriform fossa, the sublingual cavity, and the interdental space, can be reconstructed. These cavities may appear or disappear depending on the position of the larynx, the mandible, and the tongue. To illustrate our method, various static and temporal geometries of the vocal tract are reconstructed and visualized. As a proof of concept, the reconstructed geometries of three cardinal vowels are further used in an acoustic simulation, and the corresponding transfer functions are derived.
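
The idea that a cavity is defined only implicitly by the structures around it can be illustrated with a toy example. The sketch below is not the method proposed in the article; it is a minimal illustration, assuming a hypothetical 2D occupancy grid (1 = tissue, 0 = air) and a hypothetical seed cell known to lie inside the airway. A breadth-first flood fill then turns the implicit description ("whatever is not tissue and is reachable from the seed") into an explicit list of cavity cells.

```python
from collections import deque


def extract_cavity(occupied, seed):
    """Return the set of empty cells connected to `seed` (4-neighbourhood).

    `occupied` is a hypothetical 2D grid: 1 marks tissue, 0 marks air.
    `seed` is a (row, col) cell assumed to lie inside the cavity.
    """
    rows, cols = len(occupied), len(occupied[0])
    if occupied[seed[0]][seed[1]]:
        raise ValueError("seed must lie in an empty cell")

    cavity = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not occupied[nr][nc] and (nr, nc) not in cavity:
                cavity.add((nr, nc))
                queue.append((nr, nc))
    return cavity


# Toy cross-section; the air cell in row 4 is an isolated pocket that is
# not connected to the main cavity and is therefore not collected.
GRID = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]

cavity = extract_cavity(GRID, (1, 1))
```

In this toy setting, moving a "tissue" cell between time steps can connect or disconnect a pocket, which loosely mirrors how small cavities may appear or disappear as the articulators move.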

Place, publisher, year, edition, pages
John Wiley & Sons, 2018.
National Category
Computer Sciences
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-239055
DOI: 10.1002/cnm.3159
ISI: 000458548700001
OAI: oai:DiVA.org:kth-239055
DiVA, id: diva2:1263543
Funder
EU, FP7, Seventh Framework Programme, 308874
Note

QC 20181116

Available from: 2018-11-15. Created: 2018-11-15. Last updated: 2019-04-04. Bibliographically approved
In thesis
1. Computational Modeling of the Vocal Tract: Applications to Speech Production
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Human speech production is a complex process, involving neuromuscular control signals, the effects of the articulators' biomechanical properties, and acoustic wave propagation in a vocal tract tube of intricate shape. Modeling these phenomena may play an important role in advancing our understanding of the mechanisms involved, and may also have future medical applications, e.g., guiding doctors in diagnosis, treatment planning, and surgery prediction for related disorders such as oral cancer, cleft palate, obstructive sleep apnea, and dysphagia.

A more complete understanding requires models that represent the phenomena as faithfully as possible. Due to the complexity of such modeling, simplifications have nevertheless been used extensively in speech production research: phonetic descriptors (such as the position and degree of the most constricted part of the vocal tract) are used as control signals, the articulators are represented as two-dimensional geometrical models, the vocal tract is considered to be a smooth tube, plane wave propagation is assumed, etc.
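
To make the smooth-tube, plane-wave simplification concrete, the sketch below computes the resonances of the most idealized case: a uniform tube closed at the glottis and open at the lips, whose resonances follow the textbook quarter-wavelength formula f_n = (2n - 1) c / (4 L). This is not taken from the thesis; the function name, the tube length of 0.175 m, and the speed of sound of 350 m/s are assumptions chosen only for illustration.

```python
def uniform_tube_resonances(length_m, n_resonances=3, speed_of_sound=350.0):
    """Resonance frequencies (Hz) of a uniform tube closed at one end.

    Assumes plane wave propagation, a rigid-walled tube of constant
    cross-section, and an ideal open end (no radiation correction).
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, n_resonances + 1)
    ]


# Hypothetical vocal tract length of 17.5 cm; the first three resonances
# come out at roughly 500, 1500 and 2500 Hz.
formants = uniform_tube_resonances(0.175)
```

Under these assumptions the resonances depend only on the tube length, so features such as bending, lip shape, or side cavities have no effect at all, which is exactly the kind of simplification whose consequences the thesis investigates.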

This thesis aims firstly to investigate the consequences of such simplifications, and secondly to contribute to establishing unified modeling of the speech production process by connecting three-dimensional biomechanical modeling of the upper airway with three-dimensional acoustic simulations. The investigation of simplifying assumptions demonstrated the influence of vocal tract geometry features (such as shape representation, bending, and lip shape) on its acoustic characteristics, and showed that the type of modeling (geometrical or biomechanical) affects the spatial trajectories of the articulators as well as the transition of formant frequencies in the spectrogram.

The unification of biomechanical and acoustic modeling in three dimensions allows the acoustic output of dynamic sounds, such as vowel-vowel utterances, to be controlled realistically by contraction of the relevant muscles. This moves and shapes the speech articulators, which in turn define the vocal tract tube in which the wave propagation occurs. The main contribution of the thesis in this line of work is a novel and complex method that automatically reconstructs the shape of the vocal tract from the biomechanical model. This step is essential to link biomechanical and acoustic simulations, since the vocal tract, which anatomically is a cavity enclosed by different structures, is only implicitly defined in a biomechanical model constituted of several distinct articulators.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2018. p. 105
Series
TRITA-EECS-AVL ; 2018:90
Keywords
vocal tract, upper airway, speech production, biomechanical model, acoustic model, vocal tract reconstruction
National Category
Computer Sciences
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-239071
ISBN: 978-91-7873-021-6
Public defence
2018-12-07, D2, Lindstedtsvägen 5, Stockholm, 14:00 (English)
Opponent
Supervisors
Note

QC 20181116

Available from: 2018-11-16. Created: 2018-11-16. Last updated: 2018-11-16. Bibliographically approved

Open Access in DiVA

fulltext (2671 kB), 33 downloads
File information
File name: FULLTEXT01.pdf
File size: 2671 kB
Checksum (SHA-512):
0b085f68f4149c35ae728487ed668b5e0de0ab459909af1f058dddcd58f7c90d02be3a43c613f2cded804039bcbf4fc4f634bc2b788d7eba20340d701ad44f8b
Type: fulltext
Mimetype: application/pdf

Authority records BETA

Dabbaghchian, Saeed; Engwall, Olov
