Electroglottographic analysis of phonatory dynamics and states
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. (Sound and Music Computing)
2014 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

The human voice is a product of an intricate biophysical system. The complexity of this system enables a rich variety of possible sounds, but at the same time poses great challenges for quantitative voice analysis. For example, the vocal folds can vibrate in several different ways, leading to variations in the acoustic output. Because the vocal folds are relatively inaccessible, such variations are often difficult to account for. This work proposes a novel method for non-invasively extracting information on the vibratory state of the human vocal folds. Such information is important for creating a more complete voice analysis scheme. Invasive methods are undesirable because they often disturb the subjects and/or the studied phenomena, and they are also impractical in terms of accessibility and cost. A useful frame of reference for voice analysis is the Voice Range Profile (VRP). The three-dimensional form of the VRP can be used to depict any phonatory metric over the two-dimensional plane defined by the fundamental frequency of phonation (x-axis) and the sound pressure level (y-axis). The primary goal of this work was to incorporate information on the vibratory state of the vocal folds into the Voice Range Profile (e.g., as a color change). For this purpose, a novel method of analysis of the electroglottogram (EGG) was developed, using techniques from machine learning (clustering) and nonlinear time series analysis (sample entropy estimation). The analysis makes no prior assumptions about the nature of the EGG signal and does not rely on its absolute amplitude or frequency. Unlike time-domain methods, which typically define thresholds for quantifying EGG cycle metrics, the proposed method uses information from the whole of each cycle.
The analysis was applied in a variety of experimental conditions (constant vowel with different vibratory states, constant vibratory state with different vowels, constant vowel and vibratory state with varying lung volume), and the magnitude of the effect on the EGG short-term spectrum was estimated for each of these conditions. It was found that the short-term spectrum of the EGG signal sufficed to discriminate between different phonatory configurations, such as modal and falsetto voice. It was also found that even supposedly purely articulatory changes could be traced in the spectrum of the EGG signal. Finally, possible pedagogical and clinical applications of the method are discussed.
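
The abstract names sample entropy estimation as one of the techniques used. As a rough illustration only, the following Python sketch computes the standard sample entropy SampEn(m, r) of a short numeric sequence. The defaults `m=2` and `r=0.2` (tolerance as a fraction of the standard deviation) are common textbook choices and are assumptions here; this record does not state which parameters or which input sequence the thesis used.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a 1-D sequence.

    Assumption: the tolerance is r times the standard deviation of x,
    and template distance is the maximum absolute difference.
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("sequence too short for the chosen m")
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    tol = r * sd

    def count_matches(length):
        # Count template pairs (i < j) that agree within tol at every point.
        # The same n - m starting points are used for both template lengths.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= tol for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined in theory; reported here as infinity
    return -math.log(a / b)
```

A perfectly periodic sequence gives a value of zero, while irregular sequences give larger values, which is why a measure of this kind can help flag changes in regularity. The double loop is quadratic in the sequence length, so this form suits short sequences only.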

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2014, pp. vii, 31
Series
TRITA-CSC-A, ISSN 1653-5723 ; 2014:09
Keywords [en]
voice function, phonation, vocal fold vibration, vocal registers, electroglottography
National Category
Other Natural Sciences
Identifiers
URN: urn:nbn:se:kth:diva-145692
ISBN: 9789175951898 (print)
OAI: oai:DiVA.org:kth-145692
DiVA, id: diva2:719675
Presentation
2014-06-13, Room Fantum, Lindstedsvägen 24, KTH, Stockholm, 15:15 (English)
Opponent
Supervisors
Project
FonaDyn
Note

QC 20140609

Available from: 2014-06-09 Created: 2014-05-26 Last updated: 2019-01-24. Bibliographically approved
List of papers
1. Analysis of vibratory states in phonation using spectral features of the electroglottographic signal
2014 (English) In: The Journal of the Acoustical Society of America, ISSN 0001-4966, Vol. 136, no. 5, pp. 2773-2783. Article in journal (Refereed), Published
Abstract [en]

The vocal folds can oscillate in several different ways, manifest to practitioners and clinicians as ‘registers’ or ‘mechanisms’, of which the two most commonly considered are modal voice and falsetto voice. Here these will be taken as instances of different ‘vibratory states’, i.e., distinct quasi-stationary patterns of vibration of the vocal folds. State transitions are common in biomechanical nonlinear oscillators; and they are often abrupt and impossible to predict exactly. Switching state is much like switching to a different voice. Therefore, vibratory states are a source of confounding variation, for instance, when acquiring a voice range profile (VRP). In the quest for a state-aware, non-invasive VRP, a semi-automatic method based on the short-term spectrum of the electroglottographic signal (EGG) was developed. The method identifies rapid vibratory state transitions, such as the modal-falsetto switch, and clusters the EGG data based on their similarities in the relative levels and phases of the lower frequency components. Productions of known modal and falsetto voice were accurately clustered by a Gaussian mixture model. When mapped into the VRP, this EGG-based clustering revealed connected regions of different vibratory sub-regimes in both modal and falsetto.
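
The abstract above describes clustering EGG data by the relative levels and phases of the lower frequency components. The Python sketch below shows one plausible way to compute such features for a single EGG cycle, using a DFT taken over exactly one period so that bin k corresponds to harmonic k. The function name `cycle_harmonic_features`, the choice of the fundamental as the reference, and the default of four harmonics are assumptions for illustration; the paper's actual feature definition may differ. The Gaussian mixture clustering step itself is not shown.

```python
import cmath
import math


def cycle_harmonic_features(cycle, n_harmonics=4):
    """Levels (dB) and phases (rad) of harmonics 2..n_harmonics of one cycle,
    both expressed relative to the fundamental (assumed reference).

    `cycle` must hold the samples of exactly one period.
    """
    n = len(cycle)
    if n <= 2 * n_harmonics:
        raise ValueError("cycle too short for the requested number of harmonics")

    def dft_bin(k):
        # Direct evaluation of DFT bin k; adequate for a handful of bins.
        return sum(s * cmath.exp(-2j * math.pi * k * i / n)
                   for i, s in enumerate(cycle))

    x1 = dft_bin(1)
    if abs(x1) == 0.0:
        raise ValueError("no fundamental component in this cycle")
    phi1 = cmath.phase(x1)

    features = []
    for k in range(2, n_harmonics + 1):
        xk = dft_bin(k)
        ratio = abs(xk) / abs(x1)
        level_db = 20.0 * math.log10(ratio) if ratio > 0.0 else float("-inf")
        # Subtracting k * phi1 removes the effect of where the cycle was cut.
        delta = cmath.phase(xk) - k * phi1
        rel_phase = math.atan2(math.sin(delta), math.cos(delta))  # wrap to (-pi, pi]
        features.append((level_db, rel_phase))
    return features
```

Because both quantities are taken relative to the fundamental, the features do not change when the cycle is scaled in amplitude or cut at a different starting point, which fits the stated aim of not relying on absolute amplitude. Feature vectors of this kind, one per cycle, could then be passed to a clustering routine.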

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2014
Keywords
voice function, phonation, vocal registers, electroglottography, vocal fold vibrations
National Category
Fluid Mechanics and Acoustics
Identifiers
urn:nbn:se:kth:diva-145677 (URN)
10.1121/1.4896466 (DOI)
000344989000046 ()
2-s2.0-84908587626 (Scopus ID)
Project
FonaDyn
Funder
Swedish Research Council (Vetenskapsrådet), 2010-4565
Note

Updated from submitted to published.

QC 20140815

Available from: 2014-05-26 Created: 2014-05-26 Last updated: 2018-01-25. Bibliographically approved

Open Access in DiVA

Electroglottographic analysis of phonatory dynamics and states (3727 kB), 550 downloads
File information
File name: SUMMARY01.pdf
File size: 3727 kB
Checksum (SHA-512):
530eeefdf69887fe6daecf53d56a56bfb92b3953d4c07fa12ae0c4adb10c495cabafee073256bb80259ef6adbde6746c5f115129230bb80d004730481e53323a
Type: summary
MIME type: application/pdf
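
A downloaded copy of the full text can be checked against the SHA-512 checksum listed above. The following is a minimal Python sketch using the standard `hashlib` module; the helper names `sha512_of_file` and `matches_checksum` are made up for this example, and the local file path is whatever the reader saved the PDF as.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published hex string (case-insensitive)."""
    return sha512_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_checksum("SUMMARY01.pdf", published_checksum)` would return `True` only if the local file is byte-for-byte identical to the one the checksum was computed from, where `published_checksum` holds the hex string from this record.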

Search further in DiVA

By author/editor
Selamtzis, Andreas
By organisation
Speech, Music and Hearing, TMH
Other Natural Sciences

The number of downloads is the sum of downloads for all full texts. It may include, for example, previous versions that are no longer available.

Total: 416 hits