Electroglottographic analysis of phonatory dynamics and states
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. (Sound and Music Computing)
2014 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

The human voice is the product of an intricate biophysical system. The complexity of this system enables a rich variety of possible sounds, but at the same time poses great challenges for quantitative voice analysis. For example, the vocal folds can vibrate in several different ways, leading to variations in the acoustic output. Because the vocal folds are relatively inaccessible, such variations are often difficult to account for. This work proposes a novel method for non-invasively extracting information on the vibratory state of the human vocal folds. Such information is important for creating a more complete voice analysis scheme. Invasive methods are undesirable because they often disturb the subjects and/or the studied phenomena, and they are also impractical in terms of accessibility and cost. A useful frame of reference for voice analysis is the Voice Range Profile (VRP). The three-dimensional form of the VRP can be used to depict any phonatory metric over the two-dimensional plane defined by the fundamental frequency of phonation (x-axis) and the sound pressure level (y-axis). The primary goal of this work was to incorporate information on the vibratory state of the vocal folds into the VRP (e.g., as a color change). For this purpose, a novel method of analysis of the electroglottogram (EGG) was developed, using techniques from machine learning (clustering) and nonlinear time series analysis (sample entropy estimation). The analysis makes no prior assumptions about the nature of the EGG signal and does not rely on its absolute amplitude or frequency. Unlike time-domain methods, which typically define thresholds for quantifying EGG cycle metrics, the proposed method uses information from the entire waveform of each cycle.
The analysis was applied in a variety of experimental conditions (constant vowel with different vibratory states, constant vibratory state with different vowels, and constant vowel and vibratory state with varying lung volume), and the magnitude of the effect on the EGG short-term spectrum was estimated for each of these conditions. It was found that the short-term spectrum of the EGG signal sufficed to discriminate between different phonatory configurations, such as modal and falsetto voice. It was also found that even supposedly purely articulatory changes could be traced in the spectrum of the EGG signal. Finally, possible pedagogical and clinical applications of the method are discussed.
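The abstract names sample entropy estimation as one of the techniques used. As a rough illustration of that general technique only (not the thesis implementation), the following Python sketch computes the standard sample entropy of a short sequence. The function name, the embedding dimension m, the absolute tolerance r, and the two test sequences are assumptions chosen for this example and do not come from the thesis.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of sequence x (hypothetical helper, not from the thesis).

    m is the template length and r an absolute tolerance on the
    Chebyshev (maximum) distance between templates.
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("sequence too short for the chosen m")

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches excluded
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return math.inf       # undefined ratio; reported here as infinity
    return -math.log(a / b)


# A perfectly regular sequence versus an irregular one (logistic map).
regular = [0.0, 1.0] * 20
irregular = [0.3]
for _ in range(199):
    irregular.append(4.0 * irregular[-1] * (1.0 - irregular[-1]))

print(sample_entropy(regular), sample_entropy(irregular))
```

A regular sequence yields a value of zero, while an irregular one yields a positive value. In a sketch like this, a rise in the estimate over a sliding window would be one plausible way to flag a change of regime; how the thesis applies the estimate is not specified in this record.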

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2014. vii, 31 p.
Series
TRITA-CSC-A, ISSN 1653-5723 ; 2014:09
Keyword [en]
voice function, phonation, vocal fold vibration, vocal registers, electroglottography
National Category
Other Natural Sciences
Identifiers
URN: urn:nbn:se:kth:diva-145692; ISBN: 987-91-7595-189-8; OAI: oai:DiVA.org:kth-145692; DiVA: diva2:719675
Presentation
2014-06-13, sal Fantum, Lindstedtsvägen 24, KTH, Stockholm, 15:15 (English)
Projects
FonaDyn
Note

QC 20140609

Available from: 2014-06-09. Created: 2014-05-26. Last updated: 2014-06-09. Bibliographically approved.
List of papers
1. Analysis of vibratory states in phonation using spectral features of the electroglottographic signal
2014 (English). In: The Journal of the Acoustical Society of America, ISSN 0001-4966, Vol. 136, no. 5, pp. 2773-2783. Article in journal (Refereed). Published.
Abstract [en]

The vocal folds can oscillate in several different ways, manifest to practitioners and clinicians as ‘registers’ or ‘mechanisms’, of which the two most commonly considered are modal voice and falsetto voice. Here these will be taken as instances of different ‘vibratory states’, i.e., distinct quasi-stationary patterns of vibration of the vocal folds. State transitions are common in biomechanical nonlinear oscillators, and they are often abrupt and impossible to predict exactly. Switching state is much like switching to a different voice. Vibratory states are therefore a source of confounding variation, for instance when acquiring a voice range profile (VRP). In the quest for a state-aware, non-invasive VRP, a semi-automatic method based on the short-term spectrum of the electroglottographic (EGG) signal was developed. The method identifies rapid vibratory state transitions, such as the modal-falsetto switch, and clusters the EGG data based on their similarities in the relative levels and phases of the lower frequency components. Productions of known modal and falsetto voice were accurately clustered by a Gaussian mixture model. When mapped into the VRP, this EGG-based clustering revealed connected regions of different vibratory sub-regimes in both modal and falsetto.
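The abstract describes clustering EGG data by the relative levels and phases of the lower frequency components. The Python sketch below shows one way such features could be computed for a single cycle; it is an illustration under stated assumptions, not the published method. The function name, the number of harmonics, the use of a direct DFT over exactly one period, and the phase convention (phase of harmonic k minus k times the phase of the fundamental, which makes the feature independent of where the cycle is cut) are all assumptions made for this example.

```python
import cmath
import math


def cycle_descriptors(cycle, n_harmonics=4):
    """Relative level (dB) and relative phase (rad) of harmonics 2..n_harmonics.

    `cycle` holds the samples of exactly one period. Hypothetical helper:
    the feature definition is an assumption, not taken from the paper.
    """
    n = len(cycle)
    # Direct DFT at the first n_harmonics multiples of the cycle frequency.
    coeffs = [
        sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(cycle))
        for k in range(1, n_harmonics + 1)
    ]
    fund = coeffs[0]
    if abs(fund) == 0:
        raise ValueError("cycle has no fundamental component")

    features = []
    for k, c in enumerate(coeffs[1:], start=2):
        # Level relative to the fundamental, so absolute amplitude drops out.
        level_db = 20 * math.log10(max(abs(c), 1e-12) / abs(fund))
        # Phase relative to the fundamental, invariant to circular time shifts.
        raw = cmath.phase(c) - k * cmath.phase(fund)
        rel_phase = math.atan2(math.sin(raw), math.cos(raw))  # wrap to (-pi, pi]
        features.append((level_db, rel_phase))
    return features


# Synthetic one-period test signal: fundamental plus a half-amplitude 2nd harmonic.
n = 64
cycle = [math.sin(2 * math.pi * i / n) + 0.5 * math.sin(4 * math.pi * i / n)
         for i in range(n)]
print(cycle_descriptors(cycle, n_harmonics=2))
```

Because both features are taken relative to the fundamental, they do not depend on the absolute amplitude or frequency of the signal. Feature vectors of this kind, one per cycle, could then be passed to a clustering step such as a Gaussian mixture model, which is omitted here.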

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2014
Keyword
voice function, phonation, vocal registers, electroglottography, vocal fold vibrations
National Category
Fluid Mechanics and Acoustics
Identifiers
urn:nbn:se:kth:diva-145677 (URN); 10.1121/1.4896466 (DOI); 000344989000046; 2-s2.0-84908587626 (Scopus ID)
Projects
FonaDyn
Funder
Swedish Research Council, 2010-4565
Note

Updated from submitted to published.

QC 20140815

Available from: 2014-05-26. Created: 2014-05-26. Last updated: 2015-01-14. Bibliographically approved.

Open Access in DiVA

Electroglottographic analysis of phonatory dynamics and states (3727 kB), 435 downloads
File information
File name: SUMMARY01.pdf; File size: 3727 kB; Checksum: SHA-512
530eeefdf69887fe6daecf53d56a56bfb92b3953d4c07fa12ae0c4adb10c495cabafee073256bb80259ef6adbde6746c5f115129230bb80d004730481e53323a
Type: summary; Mimetype: application/pdf

Search in DiVA

By author/editor
Selamtzis, Andreas
By organisation
Speech, Music and Hearing, TMH
Other Natural Sciences

The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 140 hits