Investigation of the relationship between electroglottogram waveform, fundamental frequency, and sound pressure level using clustering
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-2995-1363
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-3362-7518
2017 (English)
In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 31, no 4, p. 393-400
Article in journal (Refereed), Published
Abstract [en]

Although previous research (Orlikoff, 1991; Henrich et al., 2005; Kuang et al., 2014; Awan, 2015) has shown that there is a relationship between the electroglottogram (EGG) waveform and the acoustic signal, this relationship is still not fully understood. To investigate it, the EGG and acoustic signals were measured for four male amateur choir singers who each produced eight consecutive tones of increasing and decreasing vocal intensity. The EGG signals were processed cycle-synchronously to obtain the discrete Fourier transform, and the data were used as input to a clustering algorithm. The acoustic signal was analyzed in terms of sound pressure level (dB SPL) and fundamental frequency (f(o)) of vibration, and the results of both the EGG and the acoustic analysis were depicted on a two-dimensional plane with f(o) on the x-axis and SPL on the y-axis. All the subjects were seen to have a weak, near-sinusoidal EGG waveform in their lowest SPL range, whereas an increase in SPL coincided with progressive enrichment of the harmonic content of the EGG waveforms. The results of the clustering were additionally used to classify waveforms across subjects, to enable inter-subject comparisons and assessment of individual strategies of exploring the f(o)-SPL dimensions. In these male subjects, the EGG waveform shape appeared to vary with SPL and to remain essentially constant with f(o) over one octave.
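
The processing chain summarized in the abstract (per-cycle discrete Fourier transform, clustering, then display on the f(o)-SPL plane) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the feature set, the clustering algorithm, or the cell size of the plane, so relative harmonic magnitudes, plain k-means, and 1 semitone by 1 dB cells are assumptions made here. All function names (`harmonic_features`, `kmeans`, `dominant_cluster_map`, `synth_cycle`) and the synthetic test cycles are hypothetical.

```python
import cmath
import math
import random
from collections import Counter, defaultdict


def harmonic_features(cycle, n_harmonics=4):
    """Magnitudes of harmonics 2..n_harmonics of one EGG cycle,
    relative to the fundamental (harmonic 1). Assumed feature set."""
    n = len(cycle)
    mags = []
    for k in range(1, n_harmonics + 1):
        # DFT coefficient k, computed over exactly one cycle of samples
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(cycle))
        mags.append(abs(coeff))
    fundamental = mags[0] or 1.0  # guard against an all-zero cycle
    return [mag / fundamental for mag in mags[1:]]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means (assumed algorithm); returns one label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


def dominant_cluster_map(fo_semitones, spl_db, labels):
    """Most frequent cluster label in each (semitone, dB) cell."""
    cells = defaultdict(Counter)
    for f, s, lab in zip(fo_semitones, spl_db, labels):
        cells[(round(f), round(s))][lab] += 1
    return {cell: counts.most_common(1)[0][0]
            for cell, counts in cells.items()}


def synth_cycle(n, harmonic_amps):
    """Synthetic stand-in for one EGG cycle: a sum of sine harmonics."""
    return [sum(a * math.sin(2 * math.pi * (h + 1) * i / n)
                for h, a in enumerate(harmonic_amps))
            for i in range(n)]


# Five near-sinusoidal "soft" cycles and five harmonically richer "loud" ones
soft = [synth_cycle(64, [1.0, 0.02 * (1 + 0.1 * j), 0.01]) for j in range(5)]
loud = [synth_cycle(64, [1.0, 0.5 + 0.01 * j, 0.3]) for j in range(5)]

features = [harmonic_features(c) for c in soft + loud]
labels = kmeans(features, k=2)

# Made-up f(o) (semitones) and SPL (dB) values for each cycle
voice_map = dominant_cluster_map([48] * 10, [60] * 5 + [80] * 5, labels)
print(labels)
print(voice_map)
```

On this synthetic input the two groups are well separated, so the soft and loud cycles end up in different clusters and each cell of the map is dominated by one cluster. Real EGG data would first need cycle segmentation, which is outside the scope of this sketch.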

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017. Vol. 31, no 4, p. 393-400
National Category
Fluid Mechanics and Acoustics
Identifiers
URN: urn:nbn:se:kth:diva-211744
DOI: 10.1016/j.jvoice.2016.11.003
ISI: 000406147000001
PubMedID: 27939138
Scopus ID: 2-s2.0-85008154357
OAI: oai:DiVA.org:kth-211744
DiVA, id: diva2:1133380
Funder
Swedish Research Council, 2010-4565 2013-0642
Note

QC 20170815

Available from: 2017-08-15. Created: 2017-08-15. Last updated: 2018-01-25. Bibliographically approved.
In thesis
1. Analyses of voice and glottographic signals in singing and speech
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Recent advances in machine learning and time series analysis techniques have brought new perspectives to a great number of scientific fields. This thesis contributes applications of such techniques to voice analysis, in an attempt to extract information on the vibration of the vocal folds as such, as well as on the radiated acoustic signal. The data analyzed in this work are acoustic recordings, electroglottographic (EGG) signals and transnasal high-speed videoendoscopic images. The data analysis techniques are primarily based on clustering, i.e., grouping of data based on similarity, and sample entropy analysis, i.e., quantifying the degree of irregularity in a given signal. The experiments were conducted so as to provide data for different types of vibratory behaviors (or vibratory states) of the vocal folds. Clustering was used to categorize these different vibratory states in an unsupervised fashion, based solely on the electroglottographic signal, or the glottal area waveform, or both. Sample entropy was used as an indicator of instabilities when subjects produced voiced sounds using irregular vibratory patterns, such as register breaks, intermittent diplophonia, and other types of irregularities. The prominent role of sound pressure level and fundamental frequency motivated further study of the relationship between them and the shape of the electroglottographic waveform. Graphical representations were created to visualize the relationship between different vibratory behaviors and fundamental frequency and sound pressure level. The EGG waveform shape was seen to depend strongly on sound pressure level and somewhat less on fundamental frequency. In very soft phonation, the almost sinusoidal waveform of the EGG suggests that studying the EGG using clusters may give a better representation compared to conventional time-domain metrics.
The clustering paradigm was later applied to synchronous recordings of electroglottogram and glottal area waveforms in professional tenor singers. Different vibratory states were classified successfully using clustering, and the electroglottogram was seen to be as good as the glottal area waveform for such a classification task. The last part of this work concerns voices from subjects with organic dysphonia. A study was dedicated to investigating how vowel context (sustained versus excerpted from speech) can affect the power of quantitative acoustic measures to discriminate dysphonic subjects from controls. Two acoustic voice quality measures were used: the cepstral peak prominence (smoothed) and sample entropy. The cepstral peak prominence (smoothed) showed better discriminatory power with excerpted vowels, while sample entropy showed better discriminatory power with sustained vowels. Additionally, it was found that sample entropy was strongly correlated with cepstral peak prominence (smoothed) and with the perceptual quality of breathiness.
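
As an illustration of the sample entropy measure mentioned in the abstract, the following Python sketch follows the commonly used definition, SampEn(m, r) = -ln(A/B), where B counts pairs of length-m templates that match within a tolerance and A counts the same for length m+1. The parameter values m = 2 and r = 0.2 times the standard deviation are widespread defaults assumed here, not values taken from the thesis, and the function name `sample_entropy` and the test signals are hypothetical.

```python
import math
import random


def sample_entropy(signal, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B), with tolerance r times the signal's
    standard deviation and Chebyshev distance between templates.
    m and r are assumed defaults, not values from the thesis."""
    n = len(signal)
    mean = sum(signal) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    tol = r * sd

    def count_matches(length):
        # Use the same n - m starting points for both template lengths
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # excludes self-matches
                dist = max(abs(a - b)
                           for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


# A strictly periodic signal versus Gaussian noise of the same length
sine = [math.sin(2 * math.pi * i / 20) for i in range(200)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]

se_sine = sample_entropy(sine)
se_noise = sample_entropy(noise)
print(se_sine, se_noise)
```

The periodic signal yields a lower value than the noise, which is the sense in which the measure quantifies irregularity. The pairwise loop is quadratic in the signal length, so it suits short excerpts only.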

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2018. p. 55
Series
TRITA-EECS-AVL ; 2018:6
Keywords
voice; singing; electroglottography; clustering; dysphonia; sample entropy
National Category
Other Natural Sciences
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-221825
ISBN: 978-91-7729-668-3
Public defence
2018-02-23, F3, Lindstedtsvägen 26, Stockholm, 13:30 (English)
Opponent
Supervisors
Projects
Phonatory dynamics and states
Funder
Swedish Research Council, 2010-4565
Swedish Research Council, 2013-0632
Note

QC 20180126

Available from: 2018-01-26. Created: 2018-01-25. Last updated: 2018-01-26. Bibliographically approved.

By author/editor
Selamtzis, Andreas; Ternström, Sten