Publications (10 of 137)
Lã, F. M. .., Sundberg, J. & Barreda, S. (2026). Effect of Phonation Type on Maximum Phonation Time. Journal of Voice
2026 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed), Epub ahead of print
Abstract [en]

Maximum phonation time (MPT) is commonly used as an indication of phonatory function. However, MPT varies considerably depending on several factors, for instance, laryngeal valving, i.e., the completeness of glottal closure. This parameter is closely related to phonation type, which can vary along the continuum between weak and firm glottal adduction, resulting in abundant and reduced glottal airflow, respectively, i.e., Breathy and Pressed phonation. This investigation analyzes the effects of both phonation type and flow rate on MPT, hypothesizing that Pressed and Breathy phonations produce extremes of MPT variation. Audio and lung volume were recorded in 14 singers experienced in performing different singing genres. They sustained the vowel /a/ as long as they could, after a maximum inhalation, a task repeated in Breathy, Flow, Neutral, and Pressed phonations. Real-time visual feedback of sound level and fundamental frequency was provided by means of the FonaDyn software, helping participants to keep these parameters constant across different degrees of vocal fold adduction. The relationship between flow rate, phonation, and MPT was analyzed using multilevel Bayesian models. Compared to equivalent frequentist models, such models offer a better quantification of uncertainty, full probability distributions, and the ability to integrate the results of previous experiments into current analyses. The results suggested that MPT varies with both flow rate and phonation type: the former is a stronger standalone predictor of MPT than the latter. The implication of such a finding is that, when flow data are not available, MPT is still a useful metric, provided that phonation type is controlled for. Indeed, much of the variation of published MPT data may reflect phonation type differences. Thus, future investigations should control for phonation type when MPT data are compared.

Place, publisher, year, edition, pages
Elsevier BV, 2026
Keywords
Flow rate, Maximum phonation time, Phonation type
National Category
Clinical Medicine; Fluid Mechanics
Identifiers
urn:nbn:se:kth:diva-378537 (URN); 10.1016/j.jvoice.2026.01.026 (DOI); 41667340 (PubMedID); 2-s2.0-105032161761 (Scopus ID)
Note

QC 20260325

Available from: 2026-03-25. Created: 2026-03-25. Last updated: 2026-03-25. Bibliographically approved
Włodarczak, M., Ludusan, B., Sundberg, J. & Heldner, M. (2025). Classification of voice quality using neck-surface acceleration: Comparison with glottal flow and radiated sound. Journal of Voice, 39(1), 10-24
2025 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 39, no 1, p. 10-24. Article in journal (Refereed), Published
Abstract [en]

Objectives: The aim of the present study is to investigate the usefulness of features extracted from miniature accelerometers attached to the speaker's tracheal wall below the glottis for classification of phonation type. The performance of the accelerometer features is evaluated relative to features obtained from inverse filtered and radiated sound. While the former is a good proxy for the voice source, obtaining robust voice source features from the latter is considered difficult since it also contains information about the vocal tract filter. By contrast, the accelerometer signal is largely unaffected by the vocal tract and although it is shaped by subglottal resonances and the transfer properties of the neck tissue, these properties remain constant within a speaker. For this reason, we expect it to provide a better approximation of the voice source than the raw audio. We also investigate which aspects of the voice source are derivable from the accelerometer and microphone signals. Methods: Five trained singers (two females and three males) were recorded producing the syllable [pæ:] in three voice qualities (neutral, breathy and pressed) and at three pitch levels as determined by the participants’ personal preference. Features extracted from the three signals were used for classification of phonation type using a random forest classifier. In addition, accelerometer and microphone features with the highest correlation with the voice source features were identified. Results: The three signals showed comparable classification error rates, with considerable differences across speakers both with respect to the overall performance and the importance of individual features. The speaker-specific differences notwithstanding, variation of phonation type had consistent effects on the voice source, accelerometer and audio signals. With regard to the voice source, AQ, NAQ, L1L2 and CQ all showed a monotonic variation along the breathy – neutral – pressed continuum. Several features were also found to vary systematically in the accelerometer and audio signals: HRF, L1L2 and CPPS (both the accelerometer and the audio), as well as the sound level (for the audio). The random forest analysis revealed that all of these features were also among the most important for the classification of voice quality. Conclusion: Both the accelerometer and the audio signals were found to discriminate between phonation types with an accuracy approaching that of the voice source. Thus, the accelerometer signal, which is largely uncontaminated by vocal tract resonances, offered no advantage over the signal collected with a normal microphone.

Place, publisher, year, edition, pages
Elsevier BV, 2025
Keywords
accelerometer, audio, phonation type classification, voice source
National Category
Signal Processing; Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-335782 (URN); 10.1016/j.jvoice.2022.06.034 (DOI); 001414592600001 (); 36028369 (PubMedID); 2-s2.0-85136510333 (Scopus ID)
Note

QC 20250226

Available from: 2023-09-07. Created: 2023-09-07. Last updated: 2025-02-26. Bibliographically approved
Havel, M. & Sundberg, J. (2025). Ex-vivo and replica measurements of nasal tract resonances. Rhinology, 63(6), Article ID 382.
2025 (English). In: Rhinology, ISSN 0300-0729, E-ISSN 1996-8604, Vol. 63, no 6, article id 382. Article in journal (Refereed), Published
Abstract [en]

Background: The nose is a resonator, the acoustic properties of which are determined by its shape. Due to its complex anatomy and hence intricate acoustical response, the identification of universal acoustic characteristics of nasalized vowels and consonants is challenging. The purpose of this investigation was to 1) elucidate acoustic properties of the nasal resonator, 2) document how the paranasal sinuses affect it, and 3) examine if 3-D replicas of anatomical specimens provide reliable data for acoustic analysis. Methods: In this experimental study, the resonance properties of the nasal tract were analyzed in ex-vivo specimens as well as in their 3-D replicas. Their sound transfer characteristics were recorded by sending a sine wave, gliding from low to high frequency, from an earphone sealed airtight into the velopharyngeal port. The response was picked up at a nostril. The acoustical influence of the sinuses was reversibly eliminated by occlusion of the sinus ostia. Results: Response curves of the nasal tract were found to possess two main resonances, one in the vicinity of 600-750 Hz and one in the 2500-3500 Hz range. Comparison of the acoustical responses obtained while including and excluding the influence of the paranasal cavities showed a great inter-individual variation in the response curve morphology. The cavities were found to introduce V-shaped sound level minima in the response curves. Conclusions: When the influence of the paranasal cavities is eliminated, the nasal cavity presents two main resonances, which are determined mainly by its anatomical length. The resonances of the paranasal cavities introduce minima and maxima in the frequency response of the nasal tract at frequencies with substantial inter-individual variation. Replicas of anatomical specimens provide reliable data for acoustic analysis.

Place, publisher, year, edition, pages
Stichting Nase, 2025
Keywords
nasal tract, paranasal sinuses, resonance, response curve, transfer function
National Category
Oto-rhino-laryngology
Identifiers
urn:nbn:se:kth:diva-378289 (URN); 10.4193/Rhin24.382 (DOI); 001633544100006 (); 41143434 (PubMedID); 2-s2.0-105025150720 (Scopus ID)
Note

QC 20260319

Available from: 2026-03-19. Created: 2026-03-19. Last updated: 2026-03-19. Bibliographically approved
Walker, R. S., Fleischer, M., Sundberg, J., Bieber, M., Zabel, H. & Mürbe, D. (2025). Retrospective longitudinal analysis of spectral features reveals divergent vocal development patterns for treble and non-treble singers. Journal of the Acoustical Society of America, 158(3), 1989-1998
2025 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 158, no 3, p. 1989-1998. Article in journal (Refereed), Published
Abstract [en]

This study investigates the longitudinal spectral development of male and female classical singers throughout conservatory training. While classical singing techniques share commonalities across voice types, physiological differences have led to gender-specific pedagogy. Previous acoustic research has explored differences in resonance strategies between genders and voice types; however, little is known about how these spectral characteristics develop during vocal training. In this retrospective longitudinal study, recordings from 117 classical voice students at the Hochschule für Musik Carl Maria von Weber Dresden were analyzed. Recordings spanned 2008-2018 during the students' 4-year bachelor studies. Countertenors were analyzed with sopranos, mezzo-sopranos, and altos as “treble” voices and tenor, baritone, and bass voices were analyzed as “non-treble” voices. Spectral measures were assessed from three different vocal exercises using long-term average spectrum. Statistical analysis utilized linear mixed-effect models to explore the effect of years of study, voice group (treble or non-treble), and their interaction. Findings reveal that the treble singers increasingly concentrated relative acoustic energy in the f0 range of the sung exercise, while the non-treble singers increasingly concentrated relative acoustic energy above 1000 Hz. Additionally, female singers exhibited increased vocal periodicity over time across all tasks, suggesting a reduction in breathiness.

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2025
National Category
Music; Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-370411 (URN); 10.1121/10.0039243 (DOI); 001570672200004 (); 40932292 (PubMedID); 2-s2.0-105015496517 (Scopus ID)
Note

QC 20250926

Available from: 2025-09-26. Created: 2025-09-26. Last updated: 2025-09-26. Bibliographically approved
Baker, C. P., Sundberg, J., Purdy, S. C., Rakena, T. O. & Leão, S. H. (2024). CPPS and Voice-Source Parameters: Objective Analysis of the Singing Voice. Journal of Voice, 38(3), 549-560
2024 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 38, no 3, p. 549-560. Article in journal (Refereed), Published
Abstract [en]

Introduction: In recent years, cepstral analysis and specific cepstrum-based measures such as smoothed cepstral peak prominence (CPPS) have become increasingly researched and utilized in attempts to determine the extent of overall dysphonia in voice signals. Yet, few studies have extensively examined how specific voice-source parameters affect CPPS values. Objective: Using a range of synthesized tones, this exploratory study sought to systematically analyze the effect of fundamental frequency (fo), vibrato extent, source-spectrum tilt, and the amplitude of the voice-source fundamental on CPPS values. Materials and Methods: A series of scales were synthesised using the freeware Madde. Fundamental frequency, vibrato extent, source-spectrum tilt, and the amplitude of the voice-source fundamental were systematically and independently varied. The tones were analysed in PRAAT, and statistical analyses were conducted in SPSS. Results: CPPS was significantly affected by both fo and source-spectrum tilt, independently. A nonlinear association was seen between vibrato extent and CPPS, where CPPS values increased from 0 to 0.6 semitones (ST), then rapidly decreased approaching 1.0 ST. No relationship was seen between the amplitude of the voice-source fundamental and CPPS. Conclusion: The large effect of fo should be taken into account when analyzing the voice, particularly in singing-voice research, when comparing pre- and posttreatment data, and when comparing inter-subject CPPS data.

Place, publisher, year, edition, pages
Elsevier BV, 2024
Keywords
Cepstral analysis, CPPS, Singing, Voice, Voice analysis
National Category
Gynaecology, Obstetrics and Reproductive Medicine
Identifiers
urn:nbn:se:kth:diva-318405 (URN); 10.1016/j.jvoice.2021.12.010 (DOI); 001235166500001 (); 35000836 (PubMedID); 2-s2.0-85122449473 (Scopus ID)
Note

QC 20240619

Available from: 2022-09-21. Created: 2022-09-21. Last updated: 2025-02-11. Bibliographically approved
Sundberg, J., Salomão, G. L. & Scherer, K. R. (2024). Emotional expressivity in singing: Assessing physiological and acoustic indicators of two opera singers' voice characteristics. Journal of the Acoustical Society of America, 155(1), 18-28
2024 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 155, no 1, p. 18-28. Article in journal (Refereed), Published
Abstract [en]

In an earlier study, we analyzed how audio signals obtained from three professional opera singers varied when they sang one-octave-wide eight-tone scales in ten different emotional colors. The results showed systematic variations in voice source and long-term-average spectrum (LTAS) parameters associated with major emotion “families”. For two of the singers, subglottal pressure (Psub) was also recorded, thus allowing analysis of an additional main physiological voice control parameter, glottal resistance (defined as the ratio between Psub and glottal flow), which is related to glottal adduction. In the present study, we analyze voice source and LTAS parameters derived from the audio signal and their correlation with Psub and glottal resistance. The measured parameters showed a systematic relationship with the four emotion families observed in our previous study. They also varied systematically with values of the ten emotions along the valence, power, and arousal dimensions; valence showed a significant correlation with the ratio between acoustic voice source energy and subglottal pressure, while power varied significantly with sound level and two measures related to the spectral dominance of the lowest spectrum partial, the fundamental.

Place, publisher, year, edition, pages
Acoustical Society of America, 2024
National Category
Natural Language Processing; Music; Fluid Mechanics
Identifiers
urn:nbn:se:kth:diva-342389 (URN); 10.1121/10.0023938 (DOI); 001135659200002 (); 38169520 (PubMedID); 2-s2.0-85181588072 (Scopus ID)
Note

QC 20240118

Available from: 2024-01-17. Created: 2024-01-17. Last updated: 2025-02-21. Bibliographically approved
Baker, C. P., Sundberg, J., Purdy, S. C. & Rakena, T. O. (2024). Female adolescent singing voice characteristics: an exploratory study using LTAS and inverse filtering. Logopedics, Phoniatrics, Vocology, 49(2), 83-95
2024 (English). In: Logopedics, Phoniatrics, Vocology, ISSN 1401-5439, E-ISSN 1651-2022, Vol. 49, no 2, p. 83-95. Article in journal (Refereed), Published
Abstract [en]

Background and Aim: To date, little research is available that objectively quantifies female adolescent singing-voice characteristics in light of the physiological and functional developments that occur from puberty to adulthood. This exploratory study sought to augment the pool of data available that offers objective voice analysis of female singers in late adolescence. Methods: Using long-term average spectra (LTAS) and inverse filtering techniques, dynamic range and voice-source characteristics were determined in a cohort of vocally healthy cis-gender female adolescent singers (17 to 19 years) from high-school choirs in Aotearoa New Zealand. Non-parametric statistics were used to determine associations and significant differences. Results: Wide intersubject variation was seen in dynamic range, spectral measures of harmonic organisation (formant cluster prominence, FCP), noise components in the spectrum (high-frequency energy ratio, HFER), and the normalised amplitude quotient (NAQ), suggesting great variability in the ability to control phonatory mechanisms such as subglottal pressure (Psub), glottal configuration and adduction, and vocal tract shaping. A strong association between the HFER and NAQ suggests that these non-invasive measures may offer complementary insights into vocal function, specifically with regard to glottal adduction and turbulent noise in the voice signal. Conclusion: Knowledge of the range of variation within healthy adolescent singers is necessary for the development of effective and inclusive pedagogical practices, and for vocal-health professionals working with singers of this age. LTAS and inverse filtering are useful non-invasive tools for determining such characteristics.

Place, publisher, year, edition, pages
Informa UK Limited, 2024
Keywords
breathiness, glottal adduction, normalised amplitude quotient, Singing voice analysis, voice pedagogy, adduction, adolescent, adult, article, choir (singing), cohort analysis, controlled study, exploratory research, female, filtration, gender, glottis, high school, human, male, New Zealand, noise, nonparametric test, pedagogics, voice, voice analysis
National Category
Music
Identifiers
urn:nbn:se:kth:diva-328933 (URN); 10.1080/14015439.2022.2140455 (DOI); 000878102700001 (); 36322641 (PubMedID); 2-s2.0-85141360070 (Scopus ID)
Note

QC 20251218

Available from: 2023-06-13. Created: 2023-06-13. Last updated: 2025-12-18. Bibliographically approved
Rosenberg, S., Sundberg, J. & Lã, F. (2024). Kulning: Acoustic and Perceptual Characteristics of a Calling Style Used Within the Scandinavian Herding Tradition. Journal of Voice, 38(3), 585-594
2024 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 38, no 3, p. 585-594. Article in journal (Refereed), Published
Abstract [en]

Kulning, a loud, high-pitched vocal calling technique pertaining to the Scandinavian herding system, has attracted several researchers' attention, mainly focusing on cultural, phonatory and musical aspects. Less attention has been paid to the spectral and physiological properties that characterize Kulning tones, and to whether there is a physiologically optimum pitch range. We analyzed tones produced by ten participants with varying experience in Kulning. They performed a phrase, pitch range G5 to C6 (784 to 1046 Hz), in three different conditions: starting (1) on pitch A5, (2) on the participant's preferred pitch, and (3) after the deepest possible inhalation, also on the participant's preferred pitch. Subglottal pressure (Psub) was measured as the oral pressure during /p/-occlusion. The quality of the Kulning was rated by a group of experts. The highest-rated tones all had a sound pressure level (SPL) at 0.3 m exceeding 115 dB and a pitch higher than 1010 Hz, while the SPL of the lowest-rated tones was less than 108 dB at a pitch below 900 Hz. A multiple regression analysis was performed to evaluate the relationship between the ratings and Psub, SPL, level of the fundamental and the frequency at which a spectrum envelope dip occurred. Highly rated tones were started at maximum lung volumes, and on participants’ preferred pitches. They all shared a high frequency of the spectrum envelope dip and a high level of the fundamental. In decreasing order of ratings, Condition 3 showed the highest values followed by Condition 2 and Condition 1. Each singer seemed to perform best within an individual Psub and pitch range. The relevance of the results to voice pedagogy, artistic, and compositional work is discussed.

Place, publisher, year, edition, pages
Elsevier BV, 2024
Keywords
Kulning, Sound pressure level, Spectrum characteristics, Subglottal pressure, Tone quality
National Category
Music; General Language Studies and Linguistics
Identifiers
urn:nbn:se:kth:diva-319614 (URN); 10.1016/j.jvoice.2021.11.016 (DOI); 001236655100001 (); 34991935 (PubMedID); 2-s2.0-85123279942 (Scopus ID)
Note

QC 20240619

Available from: 2022-10-05. Created: 2022-10-05. Last updated: 2025-02-21. Bibliographically approved
Havel, M., Sundberg, J., Traser, L., Burdumy, M. & Echternach, M. (2023). Effects of Nasalization on Vocal Tract Response Curve. Journal of Voice, 37(3), 339-347
2023 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 37, no 3, p. 339-347. Article in journal (Refereed), Published
Abstract [en]

Background: Earlier studies have shown that nasalization affects the radiated spectrum by modifying the vocal tract transfer function in a complex manner. Methods: Here we study this phenomenon by measuring the sine-sweep response of 3-D models of the vowels /u, a, ᴂ, i/, derived from volumetric MR imaging, coupled by means of tubes of different lengths and diameters to a 3-D model of a nasal tract. Results: The coupling introduced a dip into the vocal tract transfer function. The dip frequency was close to the main resonance of the nasal tract, a result in agreement with the in vivo sweep tone measurements of Fujimura & Lindqvist [Fujimura & Lindqvist, 1972]. With increasing size of the coupling tube, the depth of the dip increased and the first formant peak either changed in frequency or was split by the dip. The paranasal sinuses had only marginal effects. For certain coupling tube sizes, the spectrum balance was changed, boosting the formant peaks in the 2–4 kHz range. Conclusion: A velopharyngeal opening introduces a dip in the transfer function at the main resonance of the nasal tract. Its depth increases with the area of the opening and its frequency rises in some vowels.

Place, publisher, year, edition, pages
Elsevier BV, 2023
Keywords
Vocal tract, Nasal tract, Velopharyngeal opening, Transfer function, Sine sweep excitation, Spectrum balance, article, controlled study, excitation, in vivo study, nuclear magnetic resonance imaging, paranasal sinus, vowel
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-307431 (URN); 10.1016/j.jvoice.2021.02.013 (DOI); 000990753200001 (); 33773895 (PubMedID); 2-s2.0-85103075122 (Scopus ID)
Note

QC 20250402

Available from: 2022-01-25. Created: 2022-01-25. Last updated: 2025-04-02. Bibliographically approved
Sundberg, J., La, F. & Granqvist, S. (2023). Fundamental frequency disturbances in female and male singers' pitch glides through long tube with varied resistances. Journal of the Acoustical Society of America, 154(2), 801-807
2023 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 154, no 2, p. 801-807. Article in journal (Refereed), Published
Abstract [en]

Source-filter interaction can disturb vocal fold vibration frequency. Resonance frequency/bandwidth ratios (Q-values) may affect such interaction. Occurrences of fundamental frequency (fo) disturbances were measured in ascending pitch glides produced by four female and five male singers phonating into a 70 cm long tube. Pitch glides were produced with varied resonance Q-values of the vocal tract + tube compound (VT+tube): (i) tube end open, (ii) tube end open with nasalization, and (iii) with a piece of cotton wool in the tube end (conditions Op, Ns, and Ct, respectively). Disturbances of fo were identified by calculating the derivative of the low-pass filtered fo curve. Resonance frequencies of the compound VT+tube system were determined from ringings and glottal aspiration noise observed in narrowband spectrograms. Disturbances of fo tended to occur when a partial was close to a resonance of the compound VT+tube system. The number of such disturbances was significantly lower when the resonance Q-values were reduced (conditions Ns and Ct), particularly for the males. In some participants, resonance Q-values seemed less influential, suggesting little effect of source-filter interaction. The study sheds light on factors affecting source-filter interaction and fo control and is, therefore, relevant to voice pedagogy and theory of voice production.

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2023
National Category
Music
Identifiers
urn:nbn:se:kth:diva-334706 (URN); 10.1121/10.0020569 (DOI); 001045013800006 (); 37556565 (PubMedID); 2-s2.0-85167533243 (Scopus ID)
Note

QC 20230824

Available from: 2023-08-24. Created: 2023-08-24. Last updated: 2025-02-21. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-7234-7551