Analysis of vibratory states in phonation using spectral features of the electroglottographic signal
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. (Sound and Music Computing)
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. (Sound and Music Computing) ORCID iD: 0000-0002-3362-7518
2014 (English). In: The journal of the Acoustical Society of America, ISSN 0001-4966, Vol. 136, no 5, 2773-2783 p. Article in journal (Refereed) Published
Abstract [en]

The vocal folds can oscillate in several different ways, manifest to practitioners and clinicians as ‘registers’ or ‘mechanisms’, of which the two most commonly considered are modal voice and falsetto voice. Here these will be taken as instances of different ‘vibratory states’, i.e., distinct quasi-stationary patterns of vibration of the vocal folds. State transitions are common in biomechanical nonlinear oscillators; and they are often abrupt and impossible to predict exactly. Switching state is much like switching to a different voice. Therefore, vibratory states are a source of confounding variation, for instance, when acquiring a voice range profile (VRP). In the quest for a state-aware, non-invasive VRP, a semi-automatic method based on the short-term spectrum of the electroglottographic signal (EGG) was developed. The method identifies rapid vibratory state transitions, such as the modal-falsetto switch, and clusters the EGG data based on their similarities in the relative levels and phases of the lower frequency components. Productions of known modal and falsetto voice were accurately clustered by a Gaussian mixture model. When mapped into the VRP, this EGG-based clustering revealed connected regions of different vibratory sub-regimes in both modal and falsetto.
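The features named in the abstract (relative levels and phases of the lower frequency components of each EGG cycle) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cycle_spectrum_features`, the number of harmonics, and the particular definition of relative phase (harmonic phase minus k times the phase of the fundamental, which makes the value independent of where the cycle is cut) are assumptions made here for illustration only.

```python
import cmath
import math


def cycle_spectrum_features(cycle, n_harmonics=4):
    """Return [(level_dB, phase_rad), ...] for harmonics 2..n_harmonics of one
    EGG cycle, both expressed relative to the fundamental (harmonic 1).

    Hypothetical helper: the record does not specify the exact feature
    definition used in the paper.
    """
    n = len(cycle)
    # DFT coefficients at harmonics 1..n_harmonics of the cycle rate.
    coeffs = [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(cycle))
        for k in range(1, n_harmonics + 1)
    ]
    fund = coeffs[0]
    floor = 1e-12  # avoids log10(0) for silent harmonics
    features = []
    for k, c in enumerate(coeffs[1:], start=2):
        # Level of harmonic k relative to the fundamental, in dB.
        level_db = 20.0 * math.log10(max(abs(c), floor) / max(abs(fund), floor))
        # Assumed relative phase: phi_k - k * phi_1, wrapped to (-pi, pi].
        dphi = cmath.phase(c) - k * cmath.phase(fund)
        dphi = math.atan2(math.sin(dphi), math.cos(dphi))
        features.append((level_db, dphi))
    return features
```

Because the levels are ratios and the phases are differences, the features do not depend on the absolute amplitude of the EGG signal or on the starting point of the cycle. Feature vectors of this kind, collected over many cycles, could then be passed to a Gaussian mixture model for clustering, as the abstract describes; wrapped phases are usually easier to cluster when encoded as cosine and sine pairs.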

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2014. Vol. 136, no 5, 2773-2783 p.
Keyword [en]
voice function, phonation, vocal registers, electroglottography, vocal fold vibrations
National Category
Fluid Mechanics and Acoustics
Identifiers
URN: urn:nbn:se:kth:diva-145677
DOI: 10.1121/1.4896466
ISI: 000344989000046
Scopus ID: 2-s2.0-84908587626
OAI: oai:DiVA.org:kth-145677
DiVA: diva2:719660
Projects
FonaDyn
Funder
Swedish Research Council, 2010-4565
Note

Updated from submitted to published.

QC 20140815

Available from: 2014-05-26 Created: 2014-05-26 Last updated: 2015-01-14 Bibliographically approved
In thesis
1. Electroglottographic analysis of phonatory dynamics and states
2014 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

The human voice is a product of an intricate biophysical system. The complexity of this system enables a rich variety of possible sounds, but at the same time poses great challenges for quantitative voice analysis. For example, the vocal folds can vibrate in several different ways, leading to variations in the acoustic output. Because the vocal folds are relatively inaccessible, such variations are often difficult to account for. This work proposes a novel method for non-invasively extracting information on the vibratory state of the human vocal folds. Such information is important for creating a more complete voice analysis scheme. Invasive methods are undesirable because they often disturb the subjects and/or the studied phenomena, and they are also impractical in terms of accessibility and cost. A useful frame of reference for voice analysis is the Voice Range Profile (VRP). The three-dimensional form of the VRP can be used to depict any phonatory metric over the two-dimensional plane defined by the fundamental frequency of phonation (x-axis) and the sound pressure level (y-axis). The primary goal of this work was to incorporate information on the vibratory state of the vocal folds into the VRP (e.g., as a color change). For this purpose, a novel method of analysis of the electroglottogram (EGG) was developed, using techniques from machine learning (clustering) and nonlinear time series analysis (sample entropy estimation). The analysis makes no prior assumptions about the nature of the EGG signal and does not rely on its absolute amplitude or frequency. Unlike time-domain methods, which typically define thresholds for quantifying EGG cycle metrics, the proposed method uses information from the entire cycle of each period.
The analysis was applied in a variety of experimental conditions (constant vowel with different vibratory states, constant vibratory state and different vowels, constant vowel and vibratory state with varying lung volume) and the magnitude of effect on the EGG short-term spectrum was estimated for each of these conditions. It was found that the short-term spectrum of the EGG signal sufficed to discriminate between different phonatory configurations, such as modal and falsetto voice. It was also found that even supposedly purely articulatory changes could be traced in the spectrum of the EGG signal. Finally, possible pedagogical and clinical applications of the method are discussed.
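The thesis abstract mentions sample entropy estimation without giving details. As a rough illustration, the sketch below follows the commonly used definition SampEn(m, r) = -ln(A/B), where B counts pairs of length-m templates that match within tolerance r (Chebyshev distance) and A counts the same for length m+1. The function name, the default parameters, the use of an absolute tolerance, and the choice of which quantity the entropy is computed on are all assumptions; the record does not state how the authors applied it.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of sequence x with template length m and absolute
    tolerance r. Returns inf when no matches are found.

    Illustrative sketch only; r is often chosen as a fraction of the
    standard deviation of x, which the caller would have to apply.
    """
    n = len(x)

    def count_matches(length):
        # Use the first n - m start positions for both template lengths,
        # so that A and B are counted over the same set of index pairs.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # i != j: no self-matches
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)
```

A perfectly periodic sequence gives a value of zero, since every match of length m is also a match of length m+1, while irregular sequences give larger values. A rise in such a measure over a short time window is one way an abrupt change of vibratory state might be flagged.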

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2014. vii, 31 p.
Series
TRITA-CSC-A, ISSN 1653-5723 ; 2014:09
Keyword
voice function, phonation, vocal fold vibration, vocal registers, electroglottography
National Category
Other Natural Sciences
Identifiers
urn:nbn:se:kth:diva-145692 (URN)
987-91-7595-189-8 (ISBN)
Presentation
2014-06-13, sal Fantum, Lindstedtsvägen 24, KTH, Stockholm, 15:15 (English)
Opponent
Supervisors
Projects
FonaDyn
Note

QC 20140609

Available from: 2014-06-09 Created: 2014-05-26 Last updated: 2014-06-09 Bibliographically approved

By author/editor
Selamtzis, Andreas; Ternström, Sten