kth.se Publications
Publications (10 of 133)
Sundberg, J., Salomão, G. L. & Scherer, K. R. (2024). Emotional expressivity in singing: Assessing physiological and acoustic indicators of two opera singers' voice characteristics. Journal of the Acoustical Society of America, 155(1), 18-28
2024 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 155, no 1, p. 18-28. Article in journal (Refereed), Published
Abstract [en]

In an earlier study, we analyzed how audio signals obtained from three professional opera singers varied when they sang one-octave-wide eight-tone scales in ten different emotional colors. The results showed systematic variations in voice source and long-term-average spectrum (LTAS) parameters associated with major emotion “families”. For two of the singers, subglottal pressure (Psub) was also recorded, thus allowing analysis of an additional main physiological voice control parameter, glottal resistance (defined as the ratio between Psub and glottal flow), which is related to glottal adduction. In the present study, we analyze voice source and LTAS parameters derived from the audio signal and their correlation with Psub and glottal resistance. The measured parameters showed a systematic relationship with the four emotion families observed in our previous study. They also varied systematically with values of the ten emotions along the valence, power, and arousal dimensions; valence showed a significant correlation with the ratio between acoustic voice source energy and subglottal pressure, while power varied significantly with sound level and two measures related to the spectral dominance of the lowest spectrum partial, the fundamental.

Place, publisher, year, edition, pages
Acoustical Society of America, 2024
National Category
Language Technology (Computational Linguistics) Music Fluid Mechanics and Acoustics
Identifiers
urn:nbn:se:kth:diva-342389 (URN); 10.1121/10.0023938 (DOI); 001135659200002 (); 38169520 (PubMedID); 2-s2.0-85181588072 (Scopus ID)
Note

QC 20240118

Available from: 2024-01-17. Created: 2024-01-17. Last updated: 2024-02-01. Bibliographically approved
Sundberg, J., Lã, F. & Granqvist, S. (2023). Fundamental frequency disturbances in female and male singers' pitch glides through long tube with varied resistances. Journal of the Acoustical Society of America, 154(2), 801-807
2023 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 154, no 2, p. 801-807. Article in journal (Refereed), Published
Abstract [en]

Source-filter interaction can disturb vocal fold vibration frequency. Resonance frequency/bandwidth ratios (Q-values) may affect such interaction. Occurrences of fundamental frequency (fo) disturbances were measured in ascending pitch glides produced by four female and five male singers phonating into a 70 cm long tube. Pitch glides were produced with varied resonance Q-values of the vocal tract + tube compound (VT+tube): (i) tube end open, (ii) tube end open with nasalization, and (iii) with a piece of cotton wool in the tube end (conditions Op, Ns, and Ct, respectively). Disturbances of fo were identified by calculating the derivative of the low-pass filtered fo curve. Resonance frequencies of the compound VT+tube system were determined from ringings and glottal aspiration noise observed in narrowband spectrograms. Disturbances of fo tended to occur when a partial was close to a resonance of the compound VT+tube system. The number of such disturbances was significantly lower when the resonance Q-values were reduced (conditions Ns and Ct), particularly for the males. In some participants, resonance Q-values seemed less influential, suggesting little effect of source-filter interaction. The study sheds light on factors affecting source-filter interaction and fo control and is, therefore, relevant to voice pedagogy and theory of voice production.
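
The detection step mentioned in the abstract above (taking the derivative of a low-pass filtered fundamental frequency curve) might be sketched roughly as follows. This is an illustration only and not the authors' implementation: the moving-average filter, the window length, the slope threshold, the frame rate, and the function names `moving_average` and `find_fo_disturbances` are all assumptions made for the example.

```python
def moving_average(values, window):
    """Crude low-pass filter: centered moving average (assumed filter type)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def find_fo_disturbances(fo_hz, frame_rate_hz, window=5, threshold_hz_per_s=1000.0):
    """Return frame indices where the slope of the smoothed curve exceeds the threshold.

    The threshold value is hypothetical; a real analysis would choose it
    from the data and the expected glide rate.
    """
    smoothed = moving_average(fo_hz, window)
    flagged = []
    for i in range(1, len(smoothed)):
        # Finite-difference derivative in Hz per second
        slope = (smoothed[i] - smoothed[i - 1]) * frame_rate_hz
        if abs(slope) > threshold_hz_per_s:
            flagged.append(i)
    return flagged


# Synthetic ascending glide: 200 Hz rising by 2 Hz per frame at 100 frames/s
glide = [200.0 + 2.0 * i for i in range(100)]
# Same glide with an artificial 60 Hz jump from frame 50 onward
jumped = [f + (60.0 if i >= 50 else 0.0) for i, f in enumerate(glide)]

clean_hits = find_fo_disturbances(glide, 100.0)
jump_hits = find_fo_disturbances(jumped, 100.0)
```

In this toy example the smooth glide yields no flagged frames, while the artificial jump is flagged in the frames surrounding it, since the smoothing spreads the jump over the filter window.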

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2023
National Category
Music
Identifiers
urn:nbn:se:kth:diva-334706 (URN); 10.1121/10.0020569 (DOI); 001045013800006 (); 37556565 (PubMedID); 2-s2.0-85167533243 (Scopus ID)
Note

QC 20230824

Available from: 2023-08-24. Created: 2023-08-24. Last updated: 2023-08-24. Bibliographically approved
Włodarczak, M., Ludusan, B., Sundberg, J. & Heldner, M. (2022). Classification of voice quality using neck-surface acceleration: Comparison with glottal flow and radiated sound. Journal of Voice
2022 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed), Published
Abstract [en]

Objectives: The aim of the present study is to investigate the usefulness of features extracted from miniature accelerometers attached to the speaker's tracheal wall below the glottis for classification of phonation type. The performance of the accelerometer features is evaluated relative to features obtained from inverse filtered and radiated sound. While the former is a good proxy for the voice source, obtaining robust voice source features from the latter is considered difficult since it also contains information about the vocal tract filter. By contrast, the accelerometer signal is largely unaffected by the vocal tract and although it is shaped by subglottal resonances and the transfer properties of the neck tissue, these properties remain constant within a speaker. For this reason, we expect it to provide a better approximation of the voice source than the raw audio. We also investigate which aspects of the voice source are derivable from the accelerometer and microphone signals. Methods: Five trained singers (two females and three males) were recorded producing the syllable [pæ:] in three voice qualities (neutral, breathy and pressed) and at three pitch levels as determined by the participants’ personal preference. Features extracted from the three signals were used for classification of phonation type using a random forest classifier. In addition, accelerometer and microphone features with highest correlation with the voice source features were identified. Results: The three signals showed comparable classification error rates, with considerable differences across speakers both with respect to the overall performance and the importance of individual features. The speaker-specific differences notwithstanding, variation of phonation type had consistent effects on the voice source, accelerometer and audio signals. With regard to the voice source, AQ, NAQ, L1L2 and CQ all showed a monotonic variation along the breathy – neutral – pressed continuum. Several features were also found to vary systematically in the accelerometer and audio signals: HRF, L1L2 and CPPS (in both the accelerometer and the audio), as well as the sound level (for the audio). The random forest analysis revealed that all of these features were also among the most important for the classification of voice quality. Conclusion: Both the accelerometer and the audio signals were found to discriminate between phonation types with an accuracy approaching that of the voice source. Thus, the accelerometer signal, which is largely uncontaminated by vocal tract resonances, offered no advantage over the signal collected with a normal microphone.

Place, publisher, year, edition, pages
Elsevier BV, 2022
Keywords
accelerometer, audio, phonation type classification, voice source
National Category
Signal Processing Language Technology (Computational Linguistics)
Identifiers
urn:nbn:se:kth:diva-335782 (URN); 10.1016/j.jvoice.2022.06.034 (DOI); 2-s2.0-85136510333 (Scopus ID)
Note

QC 20230907

Available from: 2023-09-07. Created: 2023-09-07. Last updated: 2023-09-07. Bibliographically approved
Baker, C. P., Sundberg, J., Purdy, S. C., Rakena, T. O. & Leão, S. H. (2022). CPPS and Voice-Source Parameters: Objective Analysis of the Singing Voice. Journal of Voice
2022 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed), Published
Abstract [en]

Introduction: In recent years cepstral analysis and specific cepstrum-based measures such as smoothed cepstral peak prominence (CPPS) have become increasingly researched and utilized in attempts to determine the extent of overall dysphonia in voice signals. Yet, few studies have extensively examined how specific voice-source parameters affect CPPS values. Objective: Using a range of synthesized tones, this exploratory study sought to systematically analyze the effect of fundamental frequency (fo), vibrato extent, source-spectrum tilt, and the amplitude of the voice-source fundamental on CPPS values. Materials and Methods: A series of scales were synthesised using the freeware Madde. Fundamental frequency, vibrato extent, source-spectrum tilt, and the amplitude of the voice-source fundamental were systematically and independently varied. The tones were analysed in PRAAT, and statistical analyses were conducted in SPSS. Results: CPPS was significantly affected by both fo and source-spectrum tilt, independently. A nonlinear association was seen between vibrato extent and CPPS, where CPPS values increased from 0 to 0.6 semitones (ST), then rapidly decreased approaching 1.0 ST. No relationship was seen between the amplitude of the voice-source fundamental and CPPS. Conclusion: The large effect of fo should be taken into account when analyzing the voice, particularly in singing-voice research, when comparing pre- and posttreatment data, and when comparing inter-subject CPPS data.

Place, publisher, year, edition, pages
Elsevier BV, 2022
Keywords
Cepstral analysis, CPPS, Singing, Voice, Voice analysis
National Category
Obstetrics, Gynecology and Reproductive Medicine
Identifiers
urn:nbn:se:kth:diva-318405 (URN); 10.1016/j.jvoice.2021.12.010 (DOI); 2-s2.0-85122449473 (Scopus ID)
Note

QC 20220921

Available from: 2022-09-21. Created: 2022-09-21. Last updated: 2024-01-09. Bibliographically approved
Baker, C. P., Sundberg, J., Purdy, S. C. & Rakena, T. O. (2022). Female adolescent singing voice characteristics: an exploratory study using LTAS and inverse filtering. Logopedics, Phoniatrics, Vocology, 1-13
2022 (English). In: Logopedics, Phoniatrics, Vocology, ISSN 1401-5439, E-ISSN 1651-2022, p. 1-13. Article in journal (Refereed), Published
Abstract [en]

Background and Aim: To date, little research is available that objectively quantifies female adolescent singing-voice characteristics in light of the physiological and functional developments that occur from puberty to adulthood. This exploratory study sought to augment the pool of data available that offers objective voice analysis of female singers in late adolescence. Methods: Using long-term average spectra (LTAS) and inverse filtering techniques, dynamic range and voice-source characteristics were determined in a cohort of vocally healthy cis-gender female adolescent singers (17 to 19 years) from high-school choirs in Aotearoa New Zealand. Non-parametric statistics were used to determine associations and significant differences. Results: Wide intersubject variation was seen in dynamic range, spectral measures of harmonic organisation (formant cluster prominence, FCP), noise components in the spectrum (high-frequency energy ratio, HFER), and the normalised amplitude quotient (NAQ), suggesting great variability in ability to control phonatory mechanisms such as subglottal pressure (Psub), glottal configuration and adduction, and vocal tract shaping. A strong association between the HFER and NAQ suggests that these non-invasive measures may offer complementary insights into vocal function, specifically with regard to glottal adduction and turbulent noise in the voice signal. Conclusion: Knowledge of the range of variation within healthy adolescent singers is necessary for the development of effective and inclusive pedagogical practices, and for vocal-health professionals working with singers of this age. LTAS and inverse filtering are useful non-invasive tools for determining such characteristics.

Place, publisher, year, edition, pages
Informa UK Limited, 2022
Keywords
breathiness, glottal adduction, normalised amplitude quotient, Singing voice analysis, voice pedagogy, adduction, adolescent, adult, article, choir (singing), cohort analysis, controlled study, exploratory research, female, filtration, gender, glottis, high school, human, male, New Zealand, noise, nonparametric test, pedagogics, voice, voice analysis
National Category
Music
Identifiers
urn:nbn:se:kth:diva-328933 (URN); 10.1080/14015439.2022.2140455 (DOI); 000878102700001 (); 36322641 (PubMedID); 2-s2.0-85141360070 (Scopus ID)
Note

QC 20230613

Available from: 2023-06-13. Created: 2023-06-13. Last updated: 2023-06-13. Bibliographically approved
Rosenberg, S., Sundberg, J. & Lã, F. (2022). Kulning: Acoustic and Perceptual Characteristics of a Calling Style Used Within the Scandinavian Herding Tradition. Journal of Voice
2022 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed), Published
Abstract [en]

Kulning, a loud, high-pitched vocal calling technique pertaining to the Scandinavian herding system, has attracted several researchers' attention, mainly focusing on cultural, phonatory and musical aspects. Less attention has been paid to the spectral and physiological properties that characterize Kulning tones, and to whether there is a physiologically optimum pitch range. We analyzed tones produced by ten participants with varying experience in Kulning. They performed a phrase, pitch range G5 to C6 (784 to 1046 Hz), in three different conditions: starting (1) on pitch A5, (2) on the participant's preferred pitch, and (3) after the deepest possible inhalation, also on the participant's preferred pitch. Subglottal pressure (Psub) was measured as the oral pressure during /p/-occlusion. The quality of the Kulning was rated by a group of experts. The highest-rated tones all had a sound pressure level (SPL) at 0.3 m exceeding 115 dB and a pitch higher than 1010 Hz, while the SPL of the lowest-rated tones was less than 108 dB at a pitch below 900 Hz. A multiple regression analysis was performed to evaluate the relationship between the ratings and Psub, SPL, level of the fundamental and the frequency at which a spectrum envelope dip occurred. Highly rated tones were started at maximum lung volumes, and on participants’ preferred pitches. They all shared a high frequency of the spectrum envelope dip and a high level of the fundamental. In decreasing order of ratings, Condition 3 showed the highest values followed by Condition 2 and Condition 1. Each singer seemed to perform best within an individual Psub and pitch range. The relevance of the results to voice pedagogy, artistic, and compositional work is discussed.
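
The pitch names and frequencies quoted in the abstract above (G5 to C6, 784 to 1046 Hz) can be related with a short calculation. The sketch below assumes twelve-tone equal temperament with A4 = 440 Hz, which the abstract does not state explicitly; the helper name `pitch_to_hz` and the note table are introduced here purely for illustration.

```python
# Semitone offsets of the natural notes within an octave (C = 0)
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_to_hz(name, octave, a4_hz=440.0):
    """Convert a natural note name and octave number to a frequency in Hz.

    Assumes equal temperament tuned to A4 = a4_hz (an assumption, not
    something stated in the abstract).
    """
    midi_number = 12 * (octave + 1) + NOTE_OFFSETS[name]
    # A4 is MIDI note 69; each semitone is a factor of 2**(1/12)
    return a4_hz * 2 ** ((midi_number - 69) / 12)


g5 = pitch_to_hz("G", 5)
a5 = pitch_to_hz("A", 5)
c6 = pitch_to_hz("C", 6)
```

Under these assumptions G5 and C6 come out close to the 784 Hz and 1046 Hz given in the abstract, and the A5 starting pitch of Condition 1 corresponds to about 880 Hz.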

Place, publisher, year, edition, pages
Elsevier BV, 2022
Keywords
Kulning, Sound pressure level, Spectrum characteristics, Subglottal pressure, Tone quality
National Category
Music General Language Studies and Linguistics
Identifiers
urn:nbn:se:kth:diva-319614 (URN); 10.1016/j.jvoice.2021.11.016 (DOI); 34991935 (PubMedID); 2-s2.0-85123279942 (Scopus ID)
Note

QC 20221005

Available from: 2022-10-05. Created: 2022-10-05. Last updated: 2022-10-05. Bibliographically approved
Fornhammar, L., Sundberg, J., Fuchs, M. & Pieper, L. (2022). Measuring Voice Effects of Vibrato-Free and Ingressive Singing: A Study of Phonation Threshold Pressures. Journal of Voice, 36(4), 479-486
2022 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 36, no 4, p. 479-486. Article in journal (Refereed), Published
Abstract [en]

Background: Phonation threshold pressure (PTP), showing the lowest subglottal pressure producing vocal fold vibration, has been found useful for documenting various effects of phonatory conditions. The need for such documentation is relevant also to the teaching of singing, particularly in view of vocal demands raised in some contemporary as well as early music compositions. The aim of the present study was to test the usefulness of PTP measurement for evaluating phonatory effects of vibrato-free and ingressive singing in professional singers. Methods: PTP was measured at a middle, a high and a low pitch in two female and two male singers before and after recording voice range profiles (i) in habitual technique, i.e., with vibrato, (ii) in vibrato-free, and (iii) in ingressive phonation. Effects on vocal fold status were examined by videolaryngostroboscopy. Results: After careful instruction of the singers, no problems were found in applying the PTP method. In some singers, videolaryngostroboscopy showed effects after the experiment, e.g., in terms of increased mucus and more complete glottal closure. After ingressive phonation, PTP increased substantially at high pitch in one singer but changed marginally in the other singers. Conclusion: The method seems useful for assessing and interpreting effects of singing in different styles and as a part of voice diagnostics. Therefore, it seems worthwhile to automatize PTP measurement.

Place, publisher, year, edition, pages
Elsevier BV, 2022
Keywords
Extended vocal technique, Subglottal pressure, Videolaryngoscopy, Vocal loading, Voice range profile, adult, article, controlled study, female, glottis, male, mucus, phonation, pitch, pressure measurement, singing, vocal cord, voice
National Category
Otorhinolaryngology
Identifiers
urn:nbn:se:kth:diva-291722 (URN); 10.1016/j.jvoice.2020.07.023 (DOI); 000844161400008 (); 33071148 (PubMedID); 2-s2.0-85092633511 (Scopus ID)
Note

QC 20220912

Available from: 2021-03-18. Created: 2021-03-18. Last updated: 2023-10-16. Bibliographically approved
Sundberg, J. (2022). Three applications of analysis-by-synthesis in music science. Journal of Creative Music Systems, 1(1)
2022 (English). In: Journal of Creative Music Systems, E-ISSN 2399-7656, Vol. 1, no 1. Article in journal (Refereed), Published
Abstract [en]

The article describes how my research has applied the analysis-by-synthesis strategy to (1) the composition of melodies in the style of nursery tunes, (2) music performance and (3) singing. The descriptions are formulated as generative grammars, which consist of a set of ordered, context-dependent rules capable of producing sound examples. These examples readily reveal observable weaknesses in the descriptions, the origins of which can be traced in the rule system and eliminated. The grammar describing the compositional style of nursery tunes composed by A. Tegnér demonstrates the paramount relevance of a hierarchical structure. Principles underlying the transformation from a music score file to a synthesized performance are derived from recommendations by a violinist and music performance coach, and can thus be regarded as a description of his professional skills as musician and pedagogue. Also in this case the grammar demonstrates the relevance of a hierarchical structure in terms of grouping, and reflects the role of expectation in music listening. The rule system describing singing voice synthesis specifies acoustic characteristics of performance details. The descriptions are complemented by sound examples illustrating the effects of identified compositional and performance rules in the genres analysed.

Place, publisher, year, edition, pages
University of Huddersfield Press, 2022
National Category
Musicology Computer Sciences
Identifiers
urn:nbn:se:kth:diva-331153 (URN); 10.5920/jcms.1044 (DOI); 2-s2.0-85149749388 (Scopus ID)
Note

QC 20230707

Available from: 2023-07-07. Created: 2023-07-07. Last updated: 2023-07-07. Bibliographically approved
Sundberg, J., Salomão, G. L. & Scherer, K. R. (2021). Analyzing Emotion Expression in Singing via Flow Glottograms, Long-Term-Average Spectra, and Expert Listener Evaluation. Journal of Voice, 35(1), 52-60
2021 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 35, no 1, p. 52-60. Article in journal (Refereed), Published
Abstract [en]

Background: Acoustic aspects of emotional expressivity in speech have been analyzed extensively during recent decades. Emotional coloring is an important if not the most important property of sung performance, and therefore strictly controlled. Hence, emotional expressivity in singing may promote a deeper insight into vocal signaling of emotions. Furthermore, physiological voice source parameters can be assumed to facilitate the understanding of acoustical characteristics. Method: Three highly experienced professional male singers sang scales on the vowel /ae/ or /a/ in 10 emotional colors (Neutral, Sadness, Tender, Calm, Joy, Contempt, Fear, Pride, Love, Arousal, and Anger). Sixteen voice experts classified the scales in a forced-choice listening test, and the result was compared with long-term-average spectrum (LTAS) parameters and with voice source parameters, derived from flow glottograms (FLOGG) that were obtained from inverse filtering the audio signal. Results: On the basis of component analysis, the emotions could be grouped into four “families”, Anger-Contempt, Joy-Love-Pride, Calm-Tender-Neutral and Sad-Fear. Recognition of the intended emotion families by listeners reached accuracy levels far beyond chance level. For the LTAS and FLOGG parameters, vocal loudness had a paramount influence on all. Also after partialing out this factor, some significant correlations were found between FLOGG and LTAS parameters. These parameters could be sorted into groups that were associated with the emotion families. Conclusions: (i) Both LTAS and FLOGG parameters varied significantly with the enactment intentions of the singers. (ii) Some aspects of the voice source are reflected in LTAS parameters. (iii) LTAS parameters affect listener judgment of the enacted emotions and the accuracy of the intended emotional coloring.

Place, publisher, year, edition, pages
Elsevier BV, 2021
Keywords
Classical tradition, Emotion families, Enacting, Loudness, Parameter groups, adult, anger, article, controlled study, decision making, fear, filtration, human, male, sadness, singing, voice, vowel
National Category
Language Technology (Computational Linguistics) Music
Identifiers
urn:nbn:se:kth:diva-263251 (URN); 10.1016/j.jvoice.2019.08.007 (DOI); 000616864700006 (); 31543358 (PubMedID); 2-s2.0-85072523482 (Scopus ID)
Note

QC 20191106

Available from: 2019-11-06. Created: 2019-11-06. Last updated: 2022-06-26. Bibliographically approved
Lã, F., Sundberg, J. & Granqvist, S. (2021). Augmented visual-feedback of airflow: Immediate effects on voice-source characteristics of students of singing. Psychology of Music
2021 (English). In: Psychology of Music, ISSN 0305-7356, E-ISSN 1741-3087. Article in journal (Refereed), Published
Abstract [en]

Glottal adduction is a crucial aspect in voice education and vocal performance: it has major effects on phonatory airflow and, consequently, on voice timbre. As the voice is a non-visible musical instrument, controlling it could be facilitated by providing real-time visual feedback of phonatory airflow. Here, we test the usefulness of a flow ball (FB) training device, visualizing, in terms of the height of a polystyrene ball placed in a plastic basket, phonatory airflow during phonation. Audio and electroglottographic recordings of five postgraduate, classically trained singer students were made under three subsequent conditions: before, during, and after phonating into the FB. The calibrated audio signal was inverse-filtered, using an electroglottograph signal to guide the manual tuning of the inverse filters. Mean phonatory airflow, peak-to-peak pulse amplitude, and normalized amplitude quotient were extracted from the resulting flow glottograms. After the FB condition, increases of mean flow and peak-to-peak pulse amplitude were observed in four singers. In addition, the singers’ mean normalized amplitude quotient increased significantly. The findings, although exploratory, suggest that reduction of glottal adduction can be observed immediately after FB phonation. 

Place, publisher, year, edition, pages
SAGE Publications, 2021
Keywords
classical singing, flow phonation, glottal adduction, phonatory airflow, real-time visual feedback, voice training
National Category
Otorhinolaryngology Music
Identifiers
urn:nbn:se:kth:diva-310388 (URN); 10.1177/03057356211026735 (DOI); 000673661100001 (); 2-s2.0-85110038944 (Scopus ID)
Note

QC 20220404

Available from: 2022-04-04. Created: 2022-04-04. Last updated: 2022-06-25. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-7234-7551
