Sundberg, Johan
Publications (10 of 51)
Sundberg, J., Salomão, G. L. & Scherer, K. R. (2019). Analyzing Emotion Expression in Singing via Flow Glottograms, Long-Term-Average Spectra, and Expert Listener Evaluation. Journal of Voice
2019 (English) In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed) Published
Abstract [en]

Background: Acoustic aspects of emotional expressivity in speech have been analyzed extensively during recent decades. Emotional coloring is an important if not the most important property of sung performance, and therefore strictly controlled. Hence, emotional expressivity in singing may promote a deeper insight into vocal signaling of emotions. Furthermore, physiological voice source parameters can be assumed to facilitate the understanding of acoustical characteristics. Method: Three highly experienced professional male singers sang scales on the vowel /ae/ or /a/ in 10 emotional colors (Neutral, Sadness, Tender, Calm, Joy, Contempt, Fear, Pride, Love, Arousal, and Anger). Sixteen voice experts classified the scales in a forced-choice listening test, and the result was compared with long-term-average spectrum (LTAS) parameters and with voice source parameters, derived from flow glottograms (FLOGG) that were obtained from inverse filtering the audio signal. Results: On the basis of component analysis, the emotions could be grouped into four “families”, Anger-Contempt, Joy-Love-Pride, Calm-Tender-Neutral and Sad-Fear. Recognition of the intended emotion families by listeners reached accuracy levels far beyond chance level. For the LTAS and FLOGG parameters, vocal loudness had a paramount influence on all. Also after partialing out this factor, some significant correlations were found between FLOGG and LTAS parameters. These parameters could be sorted into groups that were associated with the emotion families. Conclusions: (i) Both LTAS and FLOGG parameters varied significantly with the enactment intentions of the singers. (ii) Some aspects of the voice source are reflected in LTAS parameters. (iii) LTAS parameters affect listener judgment of the enacted emotions and the accuracy of the intended emotional coloring.

Place, publisher, year, edition, pages
Mosby Inc., 2019
Keywords
Classical tradition, Emotion families, Enacting, Loudness, Parameter groups, adult, anger, article, controlled study, decision making, fear, filtration, human, male, sadness, singing, voice, vowel
National Category
Language Technology (Computational Linguistics) Music
Identifiers
urn:nbn:se:kth:diva-263251 (URN), 10.1016/j.jvoice.2019.08.007 (DOI), 2-s2.0-85072523482 (Scopus ID)
Note

QC 20191106

Available from: 2019-11-06 Created: 2019-11-06 Last updated: 2019-11-06 Bibliographically approved
Sundberg, J., Laís Salomão, G. & Scherer, K. R. (2019). Analyzing Emotion Expression in Singing via Flow Glottograms, Long-Term-Average Spectra, and Expert Listener Evaluation. Journal of Voice
2019 (English) In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed) Epub ahead of print
Abstract [en]

Background

Acoustic aspects of emotional expressivity in speech have been analyzed extensively during recent decades. Emotional coloring is an important if not the most important property of sung performance, and therefore strictly controlled. Hence, emotional expressivity in singing may promote a deeper insight into vocal signaling of emotions. Furthermore, physiological voice source parameters can be assumed to facilitate the understanding of acoustical characteristics.

Method

Three highly experienced professional male singers sang scales on the vowel /ae/ or /a/ in 10 emotional colors (Neutral, Sadness, Tender, Calm, Joy, Contempt, Fear, Pride, Love, Arousal, and Anger). Sixteen voice experts classified the scales in a forced-choice listening test, and the result was compared with long-term-average spectrum (LTAS) parameters and with voice source parameters, derived from flow glottograms (FLOGG) that were obtained from inverse filtering the audio signal.

Results

On the basis of component analysis, the emotions could be grouped into four “families”, Anger-Contempt, Joy-Love-Pride, Calm-Tender-Neutral and Sad-Fear. Recognition of the intended emotion families by listeners reached accuracy levels far beyond chance level. For the LTAS and FLOGG parameters, vocal loudness had a paramount influence on all. Also after partialing out this factor, some significant correlations were found between FLOGG and LTAS parameters. These parameters could be sorted into groups that were associated with the emotion families.

Conclusions

(i) Both LTAS and FLOGG parameters varied significantly with the enactment intentions of the singers. (ii) Some aspects of the voice source are reflected in LTAS parameters. (iii) LTAS parameters affect listener judgment of the enacted emotions and the accuracy of the intended emotional coloring.

Place, publisher, year, edition, pages
Elsevier, 2019
Keywords
Enacting, Loudness, Emotion families, Parameter groups, Classical tradition
National Category
Other Engineering and Technologies
Identifiers
urn:nbn:se:kth:diva-259735 (URN), 10.1016/j.jvoice.2019.08.007 (DOI)
Note

QC 20191118

Available from: 2019-09-22 Created: 2019-09-22 Last updated: 2019-11-18 Bibliographically approved
Sundberg, J. (2018). Flow Glottogram and Subglottal Pressure Relationship in Singers and Untrained Voices. Journal of Voice, 32(1), 23-31
2018 (English) In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 32, no 1, p. 23-31. Article in journal (Refereed) Published
Abstract [en]

This article combines results from three earlier investigations of the glottal voice source during phonation at varying degrees of vocal loudness (1) in five classically trained baritone singers (Sundberg et al., 1999), (2) in 15 female and 14 male untrained voices (Sundberg et al., 2005), and (3) in voices rated as hyperfunctional by an expert panel (Millgard et al., 2015). Voice source data were obtained by inverse filtering. Associated subglottal pressures were estimated from oral pressure during the occlusion for the consonant /p/. Five flow glottogram parameters, (1) maximum flow declination rate (MFDR), (2) peak-to-peak pulse amplitude, (3) level difference between the first and the second harmonics of the voice source, (4) closed quotient, and (5) normalized amplitude quotient, were averaged across the singer subjects and related to associated MFDR values. Strong, quantitative relations, expressed as equations, are found between subglottal pressure and MFDR and between MFDR and each of the other flow glottogram parameters. The values for the untrained voices, as well as those for the voices rated as hyperfunctional, deviate systematically from the values derived from the equations.

Place, publisher, year, edition, pages
MOSBY-ELSEVIER, 2018
Keywords
Inverse filter, Subglottal pressure, Flow glottogram, F0, Gender
National Category
Fluid Mechanics and Acoustics
Identifiers
urn:nbn:se:kth:diva-224050 (URN), 10.1016/j.jvoice.2017.03.024 (DOI), 000425917400004 (), 28495328 (PubMedID), 2-s2.0-85019022917 (Scopus ID)
Note

QC 20180316

Available from: 2018-03-16 Created: 2018-03-16 Last updated: 2018-03-16 Bibliographically approved
Han, Q. & Sundberg, J. (2017). Duration, Pitch, and Loudness in Kunqu Opera Stage Speech. Journal of Voice, 31(2), Article ID UNSP 255.e1.
2017 (English) In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 31, no 2, article id UNSP 255.e1. Article in journal (Refereed) Published
Abstract [en]

Objectives. Kunqu is a special type of opera within the Chinese tradition with 600 years of history. In it, stage speech is used for the spoken dialogue. It is performed in the Ming Dynasty's Mandarin language and is a much more dominant part of the play than singing. Stage speech deviates considerably from normal conversational speech with respect to duration, loudness and pitch. This paper compares these properties in stage speech and conversational speech. Method. A famous, highly experienced female singer performed stage speech and read the same lyrics in a conversational speech mode. Clear differences are found. Results. As compared with conversational speech, stage speech had longer word and sentence duration and word duration was less variable. Average sound level was 16 dB higher. Also mean fundamental frequency was considerably higher and more varied. Within sentences, both loudness and fundamental frequency tended to vary according to a low-high-low pattern. Conclusions. Some of the findings fail to support current opinions regarding the characteristics of stage speech, and in this sense the study demonstrates the relevance of objective measurements in descriptions of vocal styles.

Place, publisher, year, edition, pages
MOSBY-ELSEVIER, 2017
Keywords
Kunqu opera, Stage speech, Conversational speech
National Category
General Language Studies and Linguistics Language Technology (Computational Linguistics)
Identifiers
urn:nbn:se:kth:diva-205514 (URN), 10.1016/j.jvoice.2016.06.014 (DOI), 000397918800056 (), 27545077 (PubMedID), 2-s2.0-84995975260 (Scopus ID)
Note

QC 20170509

Available from: 2017-05-10 Created: 2017-05-10 Last updated: 2018-01-13 Bibliographically approved
Havel, M., Becker, S., Schuster, M., Johnson, T., Maier, A. & Sundberg, J. (2017). Effects of functional endoscopic sinus surgery on the acoustics of the sinonasal tract. Rhinology, 55(1), 81-89
2017 (English) In: Rhinology, ISSN 0300-0729, E-ISSN 1996-8604, Vol. 55, no 1, p. 81-89. Article in journal (Refereed) Published
Abstract [en]

Background: Nasal and paranasal cavities are supposed to contribute substantially to the vocal tract resonator properties. However, their acoustical effects as well as the effects of sinus surgery on the voice remain unclear. In this work we investigate resonance phenomena of paranasal sinuses prior to and after various rhinosurgical procedures in cadaveric human sinonasal tracts and corresponding 3D casts. Methodology: Nasal and paranasal cavities of formalin-preserved cadavers and corresponding 3D replicas were excited by sine tone sweeps from an earphone placed in the epipharynx. The response was picked up by a microphone at the nostrils. Different FESS procedures were performed and the acoustical responses following excitation were recorded. The measured acoustical changes in the obtained transfer functions were then evaluated. Results: Marked low frequency dips were detected in the transfer functions when sinus cavities were included in the nasal resonator system. These dips showed a significant correlation with sinus volumes. Following FESS procedures they moved upwards in frequency depending on the extent of the surgical intervention. Conclusions: The transfer functions obtained in cadaveric situs and 3D replicas showed dips at the resonance frequencies of the paranasal cavities. Marked acoustic effects in terms of increase in dip frequency following FESS procedures were reproducibly documented.

Place, publisher, year, edition, pages
INT RHINOLOGIC SOC, 2017
Keywords
functional endoscopic sinus surgery, paranasal sinuses, resonance frequency, sinonasal tract, cadaveric study, 3D replica
National Category
Otorhinolaryngology
Identifiers
urn:nbn:se:kth:diva-207911 (URN), 10.4193/Rhino16.229 (DOI), 000400724900012 (), 28060384 (PubMedID), 2-s2.0-85019240077 (Scopus ID)
Note

QC 20170530

Available from: 2017-05-30 Created: 2017-05-30 Last updated: 2018-06-18 Bibliographically approved
Sundberg, J., Scherer, K. R., Trznadel, S. & Fantini, B. (2017). Recognizing emotions in the singing voice. Psychomusicology, 27(4), 244-255
2017 (English) In: Psychomusicology, ISSN 0275-3987, E-ISSN 2162-1535, Vol. 27, no 4, p. 244-255. Article in journal (Refereed) Published
Abstract [en]

Although the human ability to recognize emotions in vocal speech utterances with reasonable accuracy has been well documented in numerous studies, little research has been reported on emotion recognition from emotional expression in the singing voice. This paper is the first to examine this issue by asking internationally known professional opera singers to portray 9 major emotions by singing sequences of nonsense syllables on the standard musical scale. We then asked more than 500 listener/judges from different cultures with a wide range of musical preferences and degree of musical knowledge to recognize the intended emotions from the voice recordings. The data show that listeners are indeed able to recognize emotions expressed in singing with better-than-chance accuracy. In addition, we find some evidence that there seem to be only minor effects of culture or language on the ability to recognize the emotional interpretations. Some emotions are more easily recognized than others are. Overall, recognition ability from the singing voice compares well to accuracy rates in studies using speaking. Judges clearly use the differential acoustic patterns of sound generated by the singers in their performance to infer the emotion expressed, as demonstrated by comparing the recognition rates for different emotions to results of statistical classification based on acoustic parameters. We also attempt to explore the nature of the inference process by examining, using path models, the major acoustic variables involved and the inference from subjectively perceived configurations of voice quality.

National Category
Musicology Psychology Computer Sciences
Identifiers
urn:nbn:se:kth:diva-259542 (URN), 10.1037/pmu0000193 (DOI)
Note

QC 20191009

Available from: 2019-09-17 Created: 2019-09-17 Last updated: 2019-10-09 Bibliographically approved
Scherer, K. R., Sundberg, J., Fantini, B., Trznadel, S. & Eyben, F. (2017). The expression of emotion in the singing voice: Acoustic patterns in vocal performance. Journal of the Acoustical Society of America, 142(4), 1805-1815
2017 (English) In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 142, no 4, p. 1805-1815. Article in journal (Refereed) Published
Abstract [en]

There has been little research on the acoustic correlates of emotional expression in the singing voice. In this study, two pertinent questions are addressed: How does a singer's emotional interpretation of a musical piece affect acoustic parameters in the sung vocalizations? Are these patterns specific enough to allow statistical discrimination of the intended expressive targets? Eight professional opera singers were asked to sing the musical scale upwards and downwards (using meaningless content) to express different emotions, as if on stage. The studio recordings were acoustically analyzed with a standard set of parameters. The results show robust vocal signatures for the emotions studied. Overall, there is a major contrast between sadness and tenderness on the one hand, and anger, joy, and pride on the other. This is based on low vs high levels on the components of loudness, vocal dynamics, high perturbation variation, and a tendency for high low-frequency energy. This pattern can be explained by the high power and arousal characteristics of the emotions with high levels on these components. A multiple discriminant analysis yields classification accuracy greatly exceeding chance level, confirming the reliability of the acoustic patterns.

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2017
Keywords
Perception, Vibrato, Speech, Communication, Music
National Category
Musicology
Identifiers
urn:nbn:se:kth:diva-223007 (URN), 10.1121/1.5002886 (DOI), 000413528900021 (), 29092548 (PubMedID), 2-s2.0-85031795162 (Scopus ID)
Funder
EU, FP7, Seventh Framework Programme, 230331-PROPEREMO
Note

QC 20180213

Available from: 2018-02-13 Created: 2018-02-13 Last updated: 2018-02-13 Bibliographically approved
Havel, M. & Sundberg, J. (2013). Contribution of paranasal sinuses to the acoustical properties of the nasal tract. In: Proceedings and Report - 8th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2013. Paper presented at 8th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2013; Firenze; Italy; 16 December 2013 through 18 December 2013 (pp. 47-50). Firenze University Press
2013 (English) In: Proceedings and Report - 8th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2013, Firenze University Press, 2013, p. 47-50. Conference paper, Published paper (Refereed)
Abstract [en]

The contribution of the nasal and paranasal cavities to vocal tract resonator properties is unclear. Here we investigate resonance phenomena of paranasal sinuses with and without selective occlusion of the middle meatus, and the sphenoidal as well as the maxillary ostium in a cadaveric situs. Nasal and paranasal cavities of the Thiel-embalmed cadaver were excited by sine-tone sweeps from an earphone in the epipharynx. A microphone at the nostrils picked up the response. Different conditions with blocked and unblocked middle meatus and sphenoidal ostium were tested. Additionally, infundibulotomy was performed allowing direct access to and selective occlusion of the maxillary ostium. Response curves showed high reproducibility. A marked dip was observed after removing single-sided occlusion of the middle meatus and the sphenoidal ostium. A marked low frequency dip was also detected after removal of occlusion of the maxillary ostium following infundibulotomy. Reproducible frequency responses of the nasal tract can be derived from cadaver measurements. Marked acoustic effects of the maxillary sinus appeared only after direct exposure of the maxillary ostium following infundibulotomy.

Place, publisher, year, edition, pages
Firenze University Press, 2013
Keywords
Paranasal sinuses, Resonance, Vocal tract, Voice
National Category
Otorhinolaryngology
Identifiers
urn:nbn:se:kth:diva-258190 (URN), 2-s2.0-85070454370 (Scopus ID), 9788866554691 (ISBN)
Conference
8th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2013; Firenze; Italy; 16 December 2013 through 18 December 2013
Note

QC 20190910

Available from: 2019-09-10 Created: 2019-09-10 Last updated: 2019-09-10 Bibliographically approved
Mathews, M. V., Friberg, A., Bennett, G., Sapp, C. & Sundberg, J. (2003). A marriage of the Director Musices program and the conductor program. In: Proceedings of the Stockholm Music Acoustics Conference, August 6-9, 2003 (SMAC 03), Stockholm, Sweden. Paper presented at Music Acoustics Conference, August 6-9, 2003 (SMAC 03), Stockholm, Sweden (Vol. 1, pp. 13-16).
2003 (English) In: Proceedings of the Stockholm Music Acoustics Conference, August 6-9, 2003 (SMAC 03), Stockholm, Sweden, 2003, Vol. 1, p. 13-16. Conference paper, Published paper (Refereed)
Abstract [en]

This paper will describe an ongoing collaboration between the authors to combine the Director Musices and Conductor programs in order to achieve a more expressive and socially interactive performance of a MIDI file score by an electronic orchestra. Director Musices processes a “square” MIDI file, adjusting the dynamics and timing of the notes to achieve the expressive performance of a trained musician. The Conductor program and the Radio-baton allow a conductor, wielding an electronic baton, to follow and synchronize with other musicians, for example to provide an orchestral accompaniment to an operatic singer. These programs may be particularly useful for student soloists who wish to practice concertos with orchestral accompaniments.

National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-234389 (URN)
Conference
Music Acoustics Conference, August 6-9, 2003 (SMAC 03), Stockholm, Sweden
Note

QC 20180913

Available from: 2018-09-06 Created: 2018-09-06 Last updated: 2018-09-13 Bibliographically approved
Sundberg, J., Friberg, A., Mathews, M. V. & Bennett, G. (2001). Experiences of combining the radio baton with the director musices performance grammar. In: MOSART project workshop on current research directions in computer music. Paper presented at MOSART project workshop on current research directions in computer music, Barcelona.
2001 (English) In: MOSART project workshop on current research directions in computer music, 2001. Conference paper, Published paper (Refereed)
National Category
Computer and Information Sciences
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-234403 (URN)
Conference
MOSART project workshop on current research directions in computer music, Barcelona
Note

QC 20180910

Available from: 2018-09-06 Created: 2018-09-06 Last updated: 2018-09-12 Bibliographically approved