Publications (10 of 66)
Ternström, S. & Pabon, P. (2019). Accounting for variability over the voice range. In: Martin Ochmann, Michael Vorländer, Janina Fels (Ed.), Proceedings of the ICA 2019 and EAA Euroregio. Paper presented at ICA 2019 and EAA Euroregio - 23rd International Congress on Acoustics, integrating 4th EAA Euroregio 2019, 9-13 September, 2019, Aachen, Germany (pp. 7775-7780). Aachen, DE: Deutsche Gesellschaft für Akustik (DEGA e.V.)
2019 (English). In: Proceedings of the ICA 2019 and EAA Euroregio / [ed] Martin Ochmann, Michael Vorländer, Janina Fels, Aachen, DE: Deutsche Gesellschaft für Akustik (DEGA e.V.), 2019, p. 7775-7780. Conference paper, Published paper (Refereed)
Abstract [en]

Researchers from the natural sciences interested in the performing arts often seek quantitative findings with explanatory power and practical relevance to performers and educators. However, the complexity of singing voice production continues to challenge us. On their own, entities that are readily measurable in the domain of physics are rarely of direct relevance to excellence in the domain of performance, because information on one level of representation (e.g., acoustic) is artistically meaningful mostly when interpreted in a context at a higher level of representation (e.g., emotional or semantic). Also, practically any acoustic or physiologic metric derived from the sound of a voice, or from other signals or images, will exhibit considerable variation both across individuals and across the voice range, from soft to loud or from low to high pitch. Here, we review some recent research based on the sampling paradigm of the voice field, also known as the voice range profile. Despite large inter-subject variation, localizing measurements by fo and SPL in the voice field makes the recorded values very reproducible within subjects. We demonstrate some technical possibilities, and argue for the importance of making physical measurements that provide a more encompassing and individual-centric view of singing voice production.

Place, publisher, year, edition, pages
Aachen, DE: Deutsche Gesellschaft für Akustik (DEGA e.V.), 2019
Keywords
variability, voice analysis, voice range profile
National Category
Signal Processing
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-259393 (URN); 978-3-939296-15-7 (ISBN)
Conference
ICA 2019 and EAA Euroregio - 23rd International Congress on Acoustics, integrating 4th EAA Euroregio 2019, 9-13 September, 2019, Aachen, Germany
Projects
Phonatory Dynamics and States
Funder
Swedish Research Council, 2010-4565
Note

Overview of a new research paradigm. QC 20191018

Available from: 2019-09-15 Created: 2019-09-15 Last updated: 2019-10-18. Bibliographically approved
Lã, F. M. & Ternström, S. (2019). Flow ball-assisted training: immediate effects on vocal fold contacting. In: Jenny Iwarsson, Stine Løvind Thorsen (Ed.), Pan-European Voice Conference 2019. Paper presented at Pan-European Voice Conference, Copenhagen, 28-31 August 2019 (pp. 50-51). University of Copenhagen
2019 (English). In: Pan-European Voice Conference 2019 / [ed] Jenny Iwarsson, Stine Løvind Thorsen, University of Copenhagen, 2019, p. 50-51. Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

Background: The flow ball is a device that creates a static backpressure in the vocal tract while providing real-time visual feedback of airflow. A ball height of 0 to 10 cm corresponds to airflows of 0.2 to 0.4 L/s. These high airflows with low transglottal pressure correspond to low flow resistances, similar to the ones obtained when phonating into straws of 3.7 mm diameter and 2.8 cm length. Objectives: To investigate whether there are immediate effects of flow ball-assisted training on vocal fold contact. Methods: Ten singers (five males and five females) performed a messa di voce at different pitches over one octave in three different conditions: before, during and after phonating with a flow ball. For all conditions, both audio and electrolaryngographic (ELG) signals were simultaneously recorded using a Laryngograph microprocessor. The vocal fold contact quotient Qci (the area under the normalized EGG cycle) and dEGGmaxN (the normalized maximum rate of change of vocal fold contact area) were obtained for all EGG cycles, using the FonaDyn system. We also introduce a compound metric Ic, the ‘index of contact’ [Qci × log10(dEGGmaxN)], with the property that it goes to zero at no contact. It combines information from both Qci and dEGGmaxN and thus it is comparable across subjects. The intra-subject means of all three metrics were computed and visualized by colour-coding over the fo-SPL plane, in cells of 1 semitone × 1 dB. Results: Overall, the use of flow ball-assisted phonation had a small yet significant effect on overall vocal fold contact across the whole messa di voce exercise. Larger effects were evident locally, i.e., in parts of the voice range. Comparing the pre-post flow-ball conditions, there were differences in Qci and/or dEGGmaxN. These differences were generally larger in male than in female voices. Ic typically decreased after flow ball use, for males but not for females. Conclusion: Flow ball-assisted training seems to modify vocal fold contacting gestures, especially in male singers.
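
As a rough illustration of the compound metric and the cell-wise averaging described in this abstract, the following Python sketch computes Ic = Qci × log10(dEGGmaxN) per EGG cycle and averages it in cells of 1 semitone × 1 dB. The function names, the tuple layout, the use of MIDI note numbers for the semitone axis, and the choice of flooring to find a cell are all assumptions made for illustration; none of them is taken from FonaDyn.

```python
import math
from collections import defaultdict

def index_of_contact(qci, deggmax_n):
    """Ic = Qci * log10(dEGGmaxN), the formula quoted in the abstract."""
    return qci * math.log10(deggmax_n)

def hz_to_semitone(fo_hz, ref_hz=440.0, ref_note=69):
    """Map fo in Hz to a fractional MIDI note number (assumed semitone scale)."""
    return ref_note + 12.0 * math.log2(fo_hz / ref_hz)

def cell_means(cycles):
    """Average Ic per (semitone, dB) cell.

    cycles: iterable of (fo_hz, spl_db, qci, deggmax_n), one tuple per EGG cycle.
    Returns a dict {(semitone, db): mean Ic}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for fo_hz, spl_db, qci, deggmax_n in cycles:
        # Assumption: a cycle belongs to the cell whose lower edges it has passed.
        cell = (math.floor(hz_to_semitone(fo_hz)), math.floor(spl_db))
        sums[cell] += index_of_contact(qci, deggmax_n)
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}
```

By the formula alone, Ic is zero whenever dEGGmaxN equals 1, since log10(1) = 0. The resulting dictionary could then be colour-coded over the fo-SPL plane with any plotting tool.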

Place, publisher, year, edition, pages
University of Copenhagen, 2019
Keywords
electroglottography, semi-occluded vocal tract, singing voice
National Category
Signal Processing
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-259394 (URN)
Conference
Pan-European Voice Conference, Copenhagen, 28-31 August 2019
Projects
Phonatory Dynamics and States
Note

QC 20191112

Available from: 2019-09-15 Created: 2019-09-15 Last updated: 2019-11-12. Bibliographically approved
Ternström, S. (2019). Normalized time-domain parameters for electroglottographic waveforms. Journal of the Acoustical Society of America, 146(1), EL65-EL70, Article ID 1.5117174.
2019 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 146, no 1, p. EL65-EL70, article id 1.5117174. Article in journal (Refereed) Published
Abstract [en]

The electroglottographic waveform is of interest for characterizing phonation non-invasively. Existing parameterizations tend to give disparate results because they rely on somewhat arbitrary thresholds and/or contacting events. It is shown that neither is needed for formulating a normalized contact quotient and a normalized peak derivative. A heuristic combination of the two also resolves the ambiguity of a moderate contact quotient, with regard to vocal fold contacting being firm versus weak or absent. As preliminaries, schemes for electroglottography signal preconditioning and time-domain period detection are described that improve somewhat on similar methods. The algorithms are simple and compute quickly.

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2019
Keywords
electroglottography; voice range profile
National Category
Signal Processing
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-255129 (URN); 10.1121/1.5117174 (DOI); 000478628800017 (ISI); 31370590 (PubMedID); 2-s2.0-85069510312 (Scopus ID)
Funder
Swedish Research Council, 2010-4565
Note

QC 20190820

Available from: 2019-07-21 Created: 2019-07-21 Last updated: 2019-08-20. Bibliographically approved
Friberg, A., Lindeberg, T., Hellwagner, M., Helgason, P., Salomão, G. L., Elovsson, A., . . . Ternström, S. (2018). Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields. Journal of the Acoustical Society of America, 144(3), 1467-1483
2018 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 144, no 3, p. 1467-1483. Article in journal (Refereed) Published
Abstract [en]

Vocal sound imitations provide a new challenge for understanding the coupling between articulatory mechanisms and the resulting audio. In this study, we have modeled the classification of three articulatory categories, phonation, supraglottal myoelastic vibrations, and turbulence from audio recordings. Two data sets were assembled, consisting of different vocal imitations by four professional imitators and four non-professional speakers in two different experiments. The audio data were manually annotated by two experienced phoneticians using a detailed articulatory description scheme. A separate set of audio features was developed specifically for each category using both time-domain and spectral methods. For all time-frequency transformations, and for some secondary processing, the recently developed Auditory Receptive Fields Toolbox was used. Three different machine learning methods were applied for predicting the final articulatory categories. The result with the best generalization was found using an ensemble of multilayer perceptrons. The cross-validated classification accuracy was 96.8 % for phonation, 90.8 % for supraglottal myoelastic vibrations, and 89.0 % for turbulence using all the 84 developed features. A final feature reduction to 22 features yielded similar results.

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2018
Keywords
vocal articulation, sound imitations, signal processing, auditory receptive fields, turbulence, phonation, supraglottal myoelastic vibration, partial least-square regression, support vector classification, ensemble learning
National Category
Signal Processing; Computer and Information Sciences
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-234295 (URN); 10.1121/1.5052438 (DOI); 000457802200049 (ISI); 2-s2.0-85053873907 (Scopus ID)
Funder
EU, FP7, Seventh Framework Programme, 618067
Note

QC 20181003

Available from: 2018-09-06 Created: 2018-09-06 Last updated: 2019-02-22. Bibliographically approved
Degirmenci, N. C., Jansson, J., Hoffman, J., Arnela, M., Sánchez-Martín, P., Guasch, O. & Ternström, S. (2017). A Unified Numerical Simulation of Vowel Production That Comprises Phonation and the Emitted Sound. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. Paper presented at 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, Stockholm, Sweden, 20 August 2017 through 24 August 2017 (pp. 3492-3496). The International Speech Communication Association (ISCA)
2017 (English). In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, The International Speech Communication Association (ISCA), 2017, p. 3492-3496. Conference paper, Published paper (Refereed)
Abstract [en]

A unified approach for the numerical simulation of vowels is presented, which accounts for the self-oscillations of the vocal folds including contact, the generation of acoustic waves and their propagation through the vocal tract, and the sound emission outward from the mouth. A monolithic incompressible fluid-structure interaction model is used to simulate the interaction between the glottal jet and the vocal folds, whereas the contact model is addressed by means of a level set application of the Eikonal equation. The coupling with acoustics is done through an acoustic analogy stemming from a simplification of the acoustic perturbation equations. This coupling is one-way in the sense that there is no feedback from the acoustics to the flow and mechanical fields. All the involved equations are solved together at each time step and in a single computational run, using the finite element method (FEM). As an application, the production of vowel [i] has been addressed. Despite the complexity of all physical phenomena to be simulated simultaneously, which requires resorting to massively parallel computing, the formant locations of vowel [i] have been well recovered.

Place, publisher, year, edition, pages
The International Speech Communication Association (ISCA), 2017
Series
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, ISSN 2308-457X ; 2017
Keywords
Numerical voice production, phonation, vocal tract acoustics, fluid-structure interaction, finite element method
National Category
Fluid Mechanics and Acoustics
Research subject
Applied and Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-219554 (URN); 10.21437/Interspeech.2017-1239 (DOI); 2-s2.0-85039159138 (Scopus ID)
Conference
18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, Stockholm, Sweden, 20 August 2017 through 24 August 2017
Projects
Eunison
Funder
EU, FP7, Seventh Framework Programme, 308874
Note

QC 20171211

Available from: 2017-12-07 Created: 2017-12-07 Last updated: 2019-10-17. Bibliographically approved
Selamtzis, A. & Ternström, S. (2017). Investigation of the relationship between electroglottogram waveform, fundamental frequency, and sound pressure level using clustering. Journal of Voice, 31(4), 393-400
2017 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 31, no 4, p. 393-400. Article in journal (Refereed) Published
Abstract [en]

Although it has been shown in previous research (Orlikoff, 1991; Henrich et al, 2005; Kuang et al, 2014; Awan, 2015) that there exists a relationship between the electroglottogram (EGG) waveform and the acoustic signal, this relationship is still not fully understood. To investigate this relationship, the EGG and acoustic signals were measured for four male amateur choir singers who each produced eight consecutive tones of increasing and decreasing vocal intensity. The EGG signals were processed cycle-synchronously to obtain the discrete Fourier transform, and the data were used as an input to a clustering algorithm. The acoustic signal was analyzed in terms of sound pressure level (dB SPL) and fundamental frequency (f(o)) of vibration, and the results of both EGG and acoustic analysis were depicted on a two-dimensional plane with f(o) on the x-axis and SPL on the y-axis. All the subjects were seen to have a weak, near-sinusoidal EGG waveform in their lowest SPL range, whereas increase in SPL coincided with progressive enrichment in harmonic content of the EGG waveforms. The results of the clustering were additionally used to classify waveforms across subjects to enable inter-subject comparisons and assessment of individual strategies of exploring the f(o)-SPL dimensions. In these male subjects, the EGG waveform shape appeared to vary with SPL and to remain essentially constant with f(o) over one octave.

Place, publisher, year, edition, pages
Elsevier, 2017
National Category
Fluid Mechanics and Acoustics
Identifiers
urn:nbn:se:kth:diva-211744 (URN); 10.1016/j.jvoice.2016.11.003 (DOI); 000406147000001 (ISI); 27939138 (PubMedID); 2-s2.0-85008154357 (Scopus ID)
Funder
Swedish Research Council, 2010-4565; 2013-0642
Note

QC 20170815

Available from: 2017-08-15 Created: 2017-08-15 Last updated: 2018-01-25. Bibliographically approved
Warhurst, S., Madill, C., McCabe, P., Ternström, S., Yiu, E. & Heard, R. (2017). Perceptual and Acoustic Analyses of Good Voice Quality in Male Radio Performers. Journal of Voice, 31(2), 259.e1-259.e12
2017 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 31, no 2, p. 259.e1-259.e12. Article in journal (Refereed) Published
Abstract [en]

Objectives

Good voice quality is an asset to professional voice users, including radio performers. We examined whether (1) voices could be reliably categorized as good for the radio and (2) these categories could be predicted using acoustic measures.

Participants and Methods

Male radio performers (n = 24) and age-matched male controls performed “The Rainbow Passage” as if presenting on the radio. Voice samples were rated using a three-stage paired-comparison paradigm by 51 naive listeners and perceptual categories were identified (Study 1), and then analyzed for fundamental frequency, long-term average spectrum, cepstral peak prominence, and pause or spoken-phrase duration (Study 2).

Results

Study 1: Good inter-judge reliability was found for perceptual judgments of the best 15 voices (good for radio category, 14/15 = radio performers), but agreement on the remaining 33 voices (unranked category) was poor. Study 2: Discriminant function analyses showed that the standard deviation (SD) of sounded portion duration, equivalent sound level, and smoothed cepstral peak prominence predicted membership of categories with moderate accuracy (R² = 0.328).

Conclusions

Radio performers are heterogeneous for voice quality; good voice quality was judged reliably in only 14 out of 24 radio performers. Current acoustic analyses detected some of the relevant signal properties that were salient in these judgments. More refined perceptual analysis and the use of other perceptual methods might provide more information on the complex nature of judging good voices.

Place, publisher, year, edition, pages
Elsevier, 2017
Keywords
voice, performance, supranormal, broadcaster, perceptual analysis
National Category
Media and Communication Technology
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-192321 (URN); 10.1016/j.jvoice.2016.05.016 (DOI); 000397918800069 (ISI); 2-s2.0-85006817447 (Scopus ID)
Note

QC 20160929

Available from: 2016-09-09 Created: 2016-09-09 Last updated: 2019-10-09. Bibliographically approved
Gustafsson, J., Ternström, S., Södersten, M. & Schalling, E. (2016). Motor-Learning-Based Adjustment of Ambulatory Feedback on Vocal Loudness for Patients With Parkinson's Disease. Journal of Voice, 30(4), 407-415
2016 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 30, no 4, p. 407-415. Article in journal (Refereed) Published
Abstract [en]

Objectives: To investigate how the direct biofeedback on vocal loudness administered with a portable voice accumulator (VoxLog) should be configured, to facilitate an optimal learning outcome for individuals with Parkinson's disease (PD), on the basis of principles of motor learning. Study Design: Methodologic development in an experimental study. Methods: The portable voice accumulator VoxLog was worn by 20 participants with PD during habitual speech in semistructured conversations. Six different biofeedback configurations were used, in random order, to study which configuration resulted in a feedback frequency closest to 20% as recommended on the basis of previous studies. Results: Activation of feedback when the wearer speaks below a threshold level of 3 dB below the speaker's mean voice sound level in habitual speech, combined with an activation time of 500 ms, resulted in a mean feedback frequency of 21.2%. Conclusions: Settings regarding threshold and activation time based on the results from this study are recommended to achieve an optimal learning outcome when administering biofeedback on vocal loudness for individuals with PD using portable voice accumulators.
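
The threshold-plus-activation-time rule reported in this abstract can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the frame-based input, the 100 ms frame length, the function name, and the reading of "activation time" as the time the level must stay below threshold before feedback starts are all hypothetical, and the abstract does not say how VoxLog implements the rule or how feedback frequency was computed.

```python
import math

def feedback_active(levels_db, mean_db, offset_db=3.0,
                    activation_ms=500.0, frame_ms=100.0):
    """Return one bool per frame: True while feedback would be on.

    levels_db: voice sound level per frame (assumed fixed frame length).
    mean_db:   the speaker's mean voice sound level in habitual speech.
    """
    threshold = mean_db - offset_db
    # Number of consecutive sub-threshold frames needed before activation.
    needed = math.ceil(activation_ms / frame_ms)
    run = 0
    active = []
    for level in levels_db:
        run = run + 1 if level < threshold else 0
        active.append(run >= needed)
    return active
```

Under these assumptions, the share of frames flagged True gives one possible proxy for how often a wearer would receive feedback with a given threshold and activation time.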

Place, publisher, year, edition, pages
Elsevier, 2016
Keywords
Biofeedback, Motor learning, Parkinson's disease, Portable voice accumulators, Voice sound level
National Category
Otorhinolaryngology; Neurology
Identifiers
urn:nbn:se:kth:diva-175645 (URN); 10.1016/j.jvoice.2015.06.003 (DOI); 000379526100005 (ISI); 2-s2.0-84936866650 (Scopus ID)
Note

QC 20151026

Available from: 2015-10-26 Created: 2015-10-19 Last updated: 2017-12-13. Bibliographically approved
Ternström, S., Pabon, P. & Södersten, M. (2016). The Voice Range Profile: its function, applications, pitfalls and potential. Acta Acoustica united with Acustica, 102(2), 268-283
2016 (English). In: Acta Acoustica united with Acustica, ISSN 1610-1928, E-ISSN 1861-9959, Vol. 102, no 2, p. 268-283. Article in journal (Refereed) Published
Abstract [en]

An overview is given of the current status of the computerised voice range profile (VRP) as a voice measurement paradigm. Its operating principles are described, and sources of errors and variability are discussed. The features of the VRP contour and its characterisation are described. Methods for performing statistics on VRP contour and interior data are considered. Examples are given of clinical, pedagogical and research applications. Finally, issues with the models used to interpret VRP data are discussed. It is concluded that, while the VRP offers a convenient frame of reference for a multitude of voice assessment metrics, it also exposes the many degrees of freedom in the voice to an extent that challenges us to improve our models of how the voice functions over a large range and in a dynamic setting.

Place, publisher, year, edition, pages
S. Hirzel Verlag, 2016
Keywords
voice, voice analysis, voice range profile
National Category
Signal Processing
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-183406 (URN); 10.3813/AAA.918943 (DOI); 000372478500008 (ISI); 2-s2.0-84961590849 (Scopus ID)
Funder
Swedish Research Council, 2010-4565; Forte, Swedish Research Council for Health, Working Life and Welfare, 2002-0416
Note

QC 20160316

Available from: 2016-03-10 Created: 2016-03-10 Last updated: 2018-10-05. Bibliographically approved
Södersten, M., Salomão, G. L., McAllister, A. & Ternström, S. (2015). Natural Voice Use in Patients With Voice Disorders and Vocally Healthy Speakers Based on 2 Days Voice Accumulator Information From a Database. Journal of Voice, 29(5), 646.e11-646.e19
2015 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 29, no 5, p. 646.e11-646.e19. Article in journal (Refereed) Published
Abstract [en]

Objectives and Study Design. Information about how patients with voice disorders use their voices in natural communicative situations is scarce. Such long-term data have for the first time been uploaded to a central database from different hospitals in Sweden. The purpose was to investigate the potential use of a large set of long-term data for establishing reference values regarding voice use in natural situations. Methods. VoxLog (Sonvox AB, Umeå, Sweden) was tested for deployment in clinical practice by speech-language pathologists working at nine hospitals in Sweden. Files from 20 patients (16 females and 4 males) with functional, organic, or neurological voice disorders and 10 vocally healthy individuals (eight females and two males) were uploaded to a remote central database. All participants had vocally demanding occupations and had been monitored for more than 2 days. The total recording time was 681 hours and 50 minutes. Data on fundamental frequency (F0, Hz), phonation time (seconds and percentage), voice sound pressure level (SPL, dB), and background noise level (dB) were analyzed for each recorded day and compared between the 2 days. Variations across each day were measured using coefficients of variation. Results. Average F0, voice SPL, and especially the level of background noise varied considerably for all participants across each day. Average F0 and voice SPL were considerably higher than reference values from laboratory recordings. Conclusions. The use of a remote central database and strict protocols can accelerate data collection from larger groups of participants and contribute to establishing reference values regarding voice use in natural situations and from patients with voice disorders. Information about activities and voice symptoms would supplement the objective data and is recommended in future studies.
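
For reference, the coefficient of variation mentioned in this abstract is the standard deviation divided by the mean. A minimal Python sketch follows; the function name is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the abstract does not state which estimator was used.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed estimator)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean
```

Applied to, say, hourly averages of voice SPL or F0 within one recorded day, this gives a unitless measure of how much the quantity varied across that day.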

Place, publisher, year, edition, pages
Elsevier, 2015
Keywords
Accelerometer, Fundamental frequency, Phonation time, Vocal loading, Voice accumulator, Voice disorders, Voice SPL
National Category
Medical Equipment Engineering
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-158172 (URN); 10.1016/j.jvoice.2014.09.006 (DOI); 000360556700023 (ISI); 2-s2.0-84941317109 (Scopus ID)
Funder
VINNOVA
Note

QC 20151006. Updated from accepted to published. QC 20160222

Available from: 2014-12-30 Created: 2014-12-30 Last updated: 2017-12-05. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-3362-7518