KTH Publications

  • 1. Aronsson, Carina
    et al.
    Bohman, Mikael
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Södersten, Maria
    Loud voice during environmental noise exposure in patients with vocal nodules (2007). In: Logopedics, Phoniatrics, Vocology, ISSN 1401-5439, E-ISSN 1651-2022, Vol. 32, no 2, p. 60-70. Article in journal (Refereed)
    Abstract [en]

    The aim was to investigate how female patients with vocal nodules use their voices when trying to make themselves heard over background noise. Ten patients with bilateral vocal fold nodules and 23 female controls were recorded reading a text in four conditions, one without noise and three with noise from cafes/pubs, played over loudspeakers at 69, 77 and 85 dBA. The noise was separated from the voice signal using a high-resolution channel estimation technique. Both patients and controls increased voice sound pressure level (SPL), fundamental frequency (F0), subglottal pressure (Ps) and their subjective ratings of strain significantly as a main effect of the increased background noise. The patients used significantly higher Ps in all four conditions. Despite this they did not differ significantly from the controls in voice SPL, F0 or perceived strain. It was concluded that speaking in background noise is a risk factor for vocal loading. Vocal loading tests in clinical settings are important and further development of assessment methods is needed.

  • 2.
    Bohman, Mikael
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Södersten, M.
    Karolinska University Hospital at Huddinge.
    The use of channel estimation techniques for investigating vocal stress in noisy environments (2003). In: Ultragarsas, ISSN 1392-2114, Vol. 3, no 48, p. 9-13. Article in journal (Other academic)
  • 3.
    Bresin, Roberto
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Askenfelt, Anders
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Friberg, Anders
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Hansen, Kjetil
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Sound and Music Computing at KTH (2012). In: Trita-TMH, ISSN 1104-5787, Vol. 52, no 1, p. 33-35. Article in journal (Other academic)
    Abstract [en]

    The SMC Sound and Music Computing group at KTH (formerly the Music Acoustics group) is part of the Department of Speech, Music and Hearing, School of Computer Science and Communication. In this short report we present the current status of the group, focusing mainly on its research.

  • 4.
    Cai, Huanchen
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Mapping Phonation Types by Clustering of Multiple Metrics (2022). In: Applied Sciences, ISSN 2076-3417, Vol. 12, no 23, p. 12092. Article in journal (Refereed)
    Abstract [en]

    For voice analysis, much work has been undertaken with a multitude of acoustic and electroglottographic metrics. However, few of these have proven to be robustly correlated with physical and physiological phenomena. In particular, all metrics are affected by the fundamental frequency and sound level, making voice assessment sensitive to the recording protocol. It was investigated whether combinations of metrics, acquired over voice maps rather than with individual sustained vowels, can offer a more functional and comprehensive interpretation. For this descriptive, retrospective study, 13 men, 13 women, and 22 children were instructed to phonate on /a/ over their full voice range. Six acoustic and EGG signal features were obtained for every phonatory cycle. An unsupervised voice classification model created feature clusters, which were then displayed on voice maps. It was found that the feature clusters may be readily interpreted in terms of phonation types. For example, the typical intense voice has a high peak EGG derivative, a relatively high contact quotient, low EGG cycle-rate entropy, and a high cepstral peak prominence in the voice signal, all represented by one cluster centroid that is mapped to a given color. In a transition region between the non-contacting and contacting of the vocal folds, the combination of metrics shows a low contact quotient and relatively high entropy, which can be mapped to a different color. Based on this data set, male phonation types could be clustered into up to six categories and female and child types into four. Combining acoustic and EGG metrics resolved more categories than either kind on their own. The inter- and intra-participant distributional features are discussed.
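
    The clustering-and-mapping pipeline summarized above can be sketched in a few lines of Python. This is a toy illustration only: the synthetic data, the choice of k-means, the cluster count, and the 1 semitone × 1 dB binning stand in for the authors' actual FonaDyn implementation and are assumptions, not their code.

        import numpy as np
        from sklearn.preprocessing import StandardScaler
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        features = rng.normal(size=(5000, 6))  # one row per phonatory cycle; 6 EGG/acoustic metrics (placeholder)
        fo_st = rng.uniform(36, 72, 5000)      # fundamental frequency in semitones (placeholder)
        spl_db = rng.uniform(50, 100, 5000)    # sound level in dB (placeholder)

        X = StandardScaler().fit_transform(features)   # z-score each metric
        labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

        # Voice map: for each 1-semitone x 1-dB cell, keep the most common cluster;
        # each cluster index would then be rendered as one color in the map.
        cells = {}
        for st, db, lab in zip(fo_st.astype(int), spl_db.astype(int), labels):
            cells.setdefault((st, db), []).append(lab)
        voice_map = {c: max(set(v), key=v.count) for c, v in cells.items()}
        print(len(voice_map), "colored map cells")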

  • 5.
    D'Amario, Sara
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics. RITMO, University of Oslo.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Friberg, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    SMAC 2023: Proceedings of the Stockholm Music Acoustics Conference 2023 (2023). Conference proceedings (editor) (Other academic)
    Abstract [en]

    This volume presents the proceedings of the fifth Stockholm Music Acoustics Conference 2023 (SMAC), which took place on 14–15 June 2023 in Stockholm, Sweden. SMAC premiered at KTH in 1983 and has been organized every ten years since then. This conference is intended for academics, music performers and instructors interested in the field of Music Acoustics. It brings together experts from different disciplines to exchange and share their recent work on many aspects of Music Acoustics, including instrument acoustics, singing voice acoustics, acoustics-based synthesis models, music performance, and music acoustics in teaching and pedagogy.

    This time, our multidisciplinary conference was organized on a smaller scale than earlier, as a track within the 2023 Sound and Music Computing Conference, at KMH Royal College of Music and KTH Royal Institute of Technology. Our warm thanks are due to the SMC Network for hosting SMAC within the framework of SMC, and to all presenters and co-authors for participating. We hope that you will enjoy learning of the new science presented here.

    Sara D’Amario, Sten Ternström and Anders Friberg

    Track chairs, Editors

  • 6.
    D'Amario, Sara
    et al.
    Department of Music Acoustics, mdw – University of Music and Performing Arts Vienna, Vienna, Austria; RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway; Department of Musicology, University of Oslo, Oslo, Norway.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Goebl, Werner
    Department of Music Acoustics, mdw – University of Music and Performing Arts Vienna, Vienna, Austria.
    Bishop, Laura
    RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway; Department of Musicology, University of Oslo, Oslo, Norway.
    Body motion of choral singers (2023). In: Frontiers in Psychology, E-ISSN 1664-1078, Vol. 14. Article in journal (Refereed)
    Abstract [en]

    Recent investigations on music performances have shown the relevance of singers’ body motion for pedagogical as well as performance purposes. However, little is known about how the perception of voice-matching or task complexity affects choristers’ body motion during ensemble singing. This study focussed on the body motion of choral singers who perform in duo along with a pre-recorded tune presented over a loudspeaker. Specifically, we examined the effects of the perception of voice-matching, operationalized in terms of sound spectral envelope, and task complexity on choristers’ body motion. Fifteen singers with advanced choral experience first manipulated the spectral components of a pre-recorded short tune composed for the study, by choosing the settings they felt most and least together with. Then, they performed the tune in unison (i.e., singing the same melody simultaneously) and in canon (i.e., singing the same melody but at a temporal delay) with the chosen filter settings. Motion data of the choristers’ upper body and audio of the repeated performances were collected and analyzed. Results show that the settings perceived as least together relate to extreme differences between the spectral components of the sound. The singers’ wrists and torso motion was more periodic, their upper body posture was more open, and their bodies were more distant from the music stand when singing in unison than in canon. These findings suggest that unison singing promotes an expressive-periodic motion of the upper body.

  • 7.
    D'Amario, Sara
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics. RITMO, University of Oslo.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Goebl, Werner
    mdw – University of Music and Performing Arts Vienna.
    Bishop, Laura
    University of Oslo, NO.
    Impact of singing togetherness and task complexity on choristers' body motion (2023). In: SMAC 2023: Proceedings of the Stockholm Music Acoustics Conference 2023 / [ed] D'Amario, S., Ternström, S., Friberg, A., Stockholm: KTH Royal Institute of Technology, 2023, p. 146-150. Conference paper (Refereed)
    Abstract [en]

    We examined the impact of the perception of singing togetherness, as indexed by the spectral envelope of the sound, and task complexity on choristers’ body motion, as they performed in duo with a pre-recorded tune presented over a loudspeaker. Fifteen experienced choral singers first manipulated the spectral filter settings of the tune in order to identify the recordings they felt most and not at all together with. Then, they sang the tune in unison and canon along with the recordings featuring the chosen filter settings. Audio and motion capture data of the musicians' upper bodies during repeated performances of the same tune were collected. Results demonstrate that wrist motion was more periodic, singer posture more open, and the overall quantity of body motion higher when singing in unison than in canon; singing togetherness did not impact body motion. The current findings suggest that some body movements may support choral performance, depending on the complexity of the task condition.

  • 8.
    Degirmenci, Niyazi Cem
    et al.
    KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
    Jansson, Johan
    KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
    Hoffman, Johan
    KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
    Arnela, Marc
    Sánchez-Martín, Patricia
    Guasch, Oriol
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    A Unified Numerical Simulation of Vowel Production That Comprises Phonation and the Emitted Sound (2017). In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, The International Speech Communication Association (ISCA), 2017, p. 3492-3496. Conference paper (Refereed)
    Abstract [en]

    A unified approach for the numerical simulation of vowels is presented, which accounts for the self-oscillations of the vocal folds including contact, the generation of acoustic waves and their propagation through the vocal tract, and the sound emission outward from the mouth. A monolithic incompressible fluid-structure interaction model is used to simulate the interaction between the glottal jet and the vocal folds, whereas the contact model is addressed by means of a level set application of the Eikonal equation. The coupling with acoustics is done through an acoustic analogy stemming from a simplification of the acoustic perturbation equations. This coupling is one-way in the sense that there is no feedback from the acoustics to the flow and mechanical fields. All the involved equations are solved together at each time step and in a single computational run, using the finite element method (FEM). As an application, the production of vowel [i] has been addressed. Despite the complexity of all physical phenomena to be simulated simultaneously, which requires resorting to massively parallel computing, the formant locations of vowel [i] have been well recovered.

  • 9.
    Ericsdotter, Christine
    et al.
    Stockholm University.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Swedish (2012). In: The Use of the International Phonetic Alphabet in the Choral Rehearsal / [ed] Duane R. Karna, Scarecrow Press, 2012, p. 245-251. Chapter in book (Other academic)
  • 10.
    Friberg, Anders
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Lindeberg, Tony
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Hellwagner, Martin
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Helgason, Pétur
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Salomão, Gláucia Laís
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Elowsson, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Lemaitre, Guillaume
    Institute for Research and Coordination in Acoustics and Music, Paris, France.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields (2018). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 144, no 3, p. 1467-1483. Article in journal (Refereed)
    Abstract [en]

    Vocal sound imitations provide a new challenge for understanding the coupling between articulatory mechanisms and the resulting audio. In this study, we have modeled the classification of three articulatory categories, phonation, supraglottal myoelastic vibrations, and turbulence from audio recordings. Two data sets were assembled, consisting of different vocal imitations by four professional imitators and four non-professional speakers in two different experiments. The audio data were manually annotated by two experienced phoneticians using a detailed articulatory description scheme. A separate set of audio features was developed specifically for each category using both time-domain and spectral methods. For all time-frequency transformations, and for some secondary processing, the recently developed Auditory Receptive Fields Toolbox was used. Three different machine learning methods were applied for predicting the final articulatory categories. The result with the best generalization was found using an ensemble of multilayer perceptrons. The cross-validated classification accuracy was 96.8 % for phonation, 90.8 % for supraglottal myoelastic vibrations, and 89.0 % for turbulence using all the 84 developed features. A final feature reduction to 22 features yielded similar results.
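
    The best-generalizing model reported above is an ensemble of multilayer perceptrons. A minimal sketch of such an ensemble with scikit-learn follows; the synthetic data, network size, and ensemble size are illustrative assumptions (only the 84-feature dimensionality is taken from the abstract).

        import numpy as np
        from sklearn.ensemble import VotingClassifier
        from sklearn.neural_network import MLPClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(1)
        X = rng.normal(size=(600, 84))               # stand-in for the 84 audio features
        y = (X[:, :3].sum(axis=1) > 0).astype(int)   # stand-in binary labels (e.g., phonation present)

        # Five MLPs with different seeds; soft voting averages their probabilities.
        ensemble = VotingClassifier(
            estimators=[(f"mlp{i}", MLPClassifier(hidden_layer_sizes=(32,),
                                                  max_iter=2000, random_state=i))
                        for i in range(5)],
            voting="soft",
        )
        print(cross_val_score(ensemble, X, y, cv=5).mean())  # cross-validated accuracy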

  • 11. Gramming, Patricia
    et al.
    Sundberg, Johan
    KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics.
    Ternström, Sten
    KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics.
    Leanderson, Rolf
    Perkins, William H.
    Relationship between changes in voice pitch and loudness (1988). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 2, no 2, p. 118-126. Article in journal (Refereed)
    Abstract [en]

    Changes in mean fundamental frequency accompanying changes in loudness of phonation are analyzed in 9 professional singers, 9 nonsingers, and 10 male and 10 female patients suffering from functional voice dysfunction. The subjects read discursive texts with noise in earphones, and some also read at voluntarily varied vocal loudness. The healthy subjects phonated as softly and as loudly as possible at various fundamental frequencies throughout their pitch ranges, and the resulting mean phonetograms are compared. Mean pitch was found to increase by about half a semitone per decibel of sound level. Overall, the subject groups gave similar results, although the singers changed voice pitch more than the nonsingers. The voice pitch changes may be explained as passive results of changes of subglottal pressure required for the sound level variation.
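
    The reported slope of about half a semitone per decibel implies a simple conversion from a loudness change to an expected pitch change. A worked sketch (the 0.5 ST/dB slope is taken from the abstract; the starting frequency is arbitrary):

        def predicted_f0(f0_hz: float, delta_spl_db: float, st_per_db: float = 0.5) -> float:
            # Expected F0 after a loudness change, at st_per_db semitones per dB.
            semitones = st_per_db * delta_spl_db
            return f0_hz * 2 ** (semitones / 12.0)

        # A 110 Hz speaker who raises the voice by 10 dB would be expected
        # to drift up by ~5 semitones, to about 147 Hz.
        print(round(predicted_f0(110.0, 10.0), 1))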

  • 12.
    Grell, Anke
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Sundberg, Johan
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Ptok, Martin
    Altenmueller, Eckart
    Rapid pitch correction in choir singers (2009). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 126, no 1, p. 407-413. Article in journal (Refereed)
    Abstract [en]

    Highly and moderately skilled choral singers listened to a perfect fifth reference, with the instruction to complement the fifth such that a major triad resulted. The fifth was suddenly and unexpectedly shifted in pitch, and the singers' task was to shift the fundamental frequency of the sung tone accordingly. The F0 curves during the transitions often showed two phases, an initial quick and large change followed by a slower and smaller change, apparently intended to fine-tune voice F0 to complement the fifth. Anesthetizing the vocal folds of moderately skilled singers tended to delay the reaction. The means of the response times varied in the range 197-259 ms depending on direction and size of the pitch shifts, as well as on skill and anesthetization.

  • 13. Guasch, O.
    et al.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Arnela, M.
    Alias, F.
    Unified numerical simulation of the physics of voice: The EUNISON project (2013). Conference paper (Other academic)
    Abstract [en]

    In this demo we will briefly outline the scope of the European EUNISON project, which aims at a unified numerical simulation of the physics of voice by resorting to supercomputer facilities, and present some of its preliminary results obtained to date.

  • 14.
    Guasch, Oriol
    et al.
    La Salle, Universitat Ramon Llull, Barcelona.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Some current challenges in unified numerical simulations of voice production: from biomechanics to the emitted sound (2017). In: ISSP 2017 Proceedings, Tianjin, China: Institute of Linguistics, CASS, 2017, p. 87-89, article id S2-2. Conference paper (Other academic)
    Abstract [en]

    Voice production all the way from muscle activation to sound - are we there yet? Three-dimensional (3D) numerical simulations of the entire process of voice generation appear to be very challenging. Muscle activations position the articulators, which define a vocal tract geometry and posture the vocal folds. Air emanating from the lungs induces self-oscillations of the vocal folds, which result in aeroacoustic sources and the subsequent propagation of acoustic waves inside the vocal tract (VT). There, many things could happen. For instance, the air could resonate to generate vowels, or, at constrictions, airflow may be accelerated to create turbulent sounds such as fricatives. The vocal tract walls are flexible and react to the inner acoustic pressure. Also, articulators can change the vocal tract geometry to generate vowel-vowel utterances or syllables. Sound is finally radiated from the mouth. Attempting unified 3D numerical simulations of all the above processes, which involve coupling of a biomechanical model and the mechanical, fluid and acoustic fields, may seem unwise. Most research to date has addressed a few selected aspects of voice production. Unified approaches have been shunned for their daunting complexity and high-performance parallel computation requirements. This situation now seems to be changing. In this paper, we briefly review recent approaches towards 3D realistic voice simulation that unify, at least to some extent, some of the involved physical fields. Remaining challenges will be highlighted. We will focus on those works which end with the production of a given sound, thus leaving aside the huge amount of literature solely devoted to the complex simulation of phonation.

  • 15. Gustafsson, J.
    et al.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Södersten, M.
    Schalling, E.
    Motor-Learning-Based Adjustment of Ambulatory Feedback on Vocal Loudness for Patients With Parkinson's Disease (2016). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 30, no 4, p. 407-415. Article in journal (Refereed)
    Abstract [en]

    Objectives: To investigate how the direct biofeedback on vocal loudness administered with a portable voice accumulator (VoxLog) should be configured, to facilitate an optimal learning outcome for individuals with Parkinson's disease (PD), on the basis of principles of motor learning. Study Design: Methodologic development in an experimental study. Methods: The portable voice accumulator VoxLog was worn by 20 participants with PD during habitual speech in semistructured conversations. Six different biofeedback configurations were used, in random order, to study which configuration resulted in a feedback frequency closest to 20%, as recommended on the basis of previous studies. Results: Activation of feedback when the wearer speaks below a threshold level of 3 dB below the speaker's mean voice sound level in habitual speech, combined with an activation time of 500 ms, resulted in a mean feedback frequency of 21.2%. Conclusions: Settings regarding threshold and activation time based on the results from this study are recommended to achieve an optimal learning outcome when administering biofeedback on vocal loudness for individuals with PD using portable voice accumulators.
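
    The recommended configuration (threshold 3 dB below the speaker's habitual mean, 500 ms activation time) amounts to a simple rule over a sampled voice-level trace. A sketch, assuming a regularly sampled SPL contour of voiced speech; the sampling rate and synthetic trace are invented for illustration:

        import numpy as np

        def feedback_frequency(spl_db, fs_hz, mean_spl_db,
                               threshold_db=3.0, activation_s=0.5):
            # Fraction of frames in which feedback fires: the voice level has
            # stayed below (mean - threshold) for at least the activation time.
            below = spl_db < (mean_spl_db - threshold_db)
            min_run = int(activation_s * fs_hz)
            fired = np.zeros(len(below), dtype=bool)
            run = 0
            for i, b in enumerate(below):
                run = run + 1 if b else 0
                fired[i] = run >= min_run
            return fired.mean()

        rng = np.random.default_rng(2)
        trace = 65 + 4 * rng.standard_normal(10_000)   # toy SPL contour at 100 frames/s
        print(feedback_frequency(trace, fs_hz=100, mean_spl_db=65.0))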

  • 16.
    Gustafsson, Joakim Körner
    et al.
    Karolinska Institutet.
    Södersten, Maria
    Karolinska Institutet.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Speech Communication and Technology.
    Schalling, Ellika
    Karolinska Institutet.
    Voice use in daily life studied with a portable voice accumulator in individuals with Parkinson’s disease and matched healthy controls (2019). In: Journal of Speech, Language and Hearing Research, ISSN 1092-4388, E-ISSN 1558-9102, Vol. 62, no 12, p. 4324-4334. Article in journal (Refereed)
    Abstract [en]

    Purpose: The purpose of this work was to study how voice use in daily life is impacted by Parkinson’s disease (PD), specifically whether there is a difference in voice sound level and phonation ratio during everyday activities for individuals with PD and matched healthy controls. A further aim was to study how variations in environmental noise impact voice use. Method: Long-term registration of voice use during 1 week in daily life was performed for 21 participants with PD (11 male, 10 female) and 21 matched healthy controls, using the portable voice accumulator VoxLog. Voice use was assessed through registrations of spontaneous speech in different ranges of environmental noise in daily life and in a controlled studio recording setting. Results: Individuals with PD use their voice 50%-60% less than their matched healthy controls in daily life. The difference increases in high levels of environmental noise. Individuals with PD used an average voice sound level in daily life that was 8.11 dB (female) and 6.7 dB (male) lower than that of their matched healthy controls. The difference in mean voice sound level between individuals with PD and controls during spontaneous speech in a controlled studio registration was 3.0 dB for the female group and 4.1 dB for the male group. Conclusions: The observed difference in voice use in daily life between individuals with PD and matched healthy controls is a first step to objectively quantify the impact of PD on communicative participation. The variations in voice use in different levels of environmental noise, and when comparing controlled and variable environments, support the idea that the study of voice use should include methods to assess function in less controlled situations outside the clinical setting.

  • 17. Herbst, Christian T.
    et al.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Svec, Jan G.
    Investigation of four distinct glottal configurations in classical singing - A pilot study (2009). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 125, no 3, p. EL104-EL109. Article in journal (Refereed)
    Abstract [en]

    This study investigates four qualities of singing voice in a classically trained baritone: "naive falsetto," "countertenor falsetto," "lyrical chest" and "full chest." Laryngeal configuration and vocal fold behavior in these qualities were studied using laryngeal videostroboscopy, videokymography, electroglottography, and sound spectrography. The data suggest that the four voice qualities were produced by independently manipulating mainly two laryngeal parameters: (1) the adduction of the arytenoid cartilages and (2) the thickening of the vocal folds. An independent control of the posterior adductory muscles versus the vocalis muscle is considered to be the physiological basis for achieving these singing voice qualities.

  • 18. Herbst, Christian
    et al.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    A comparison of different methods to measure the EGG contact quotient (2006). In: Logopedics, Phoniatrics, Vocology, ISSN 1401-5439, E-ISSN 1651-2022, Vol. 31, no 3, p. 126-138. Article in journal (Refereed)
    Abstract [en]

    The results from six published electroglottographic (EGG-based) methods for calculating the EGG contact quotient (CQEGG) were compared to closed quotients derived from simultaneous videokymographic imaging (CQKYM). Two trained male singers phonated in falsetto and in chest register, with two degrees of adduction in both registers. The maximum difference between methods in the CQEGG was 0.3 (out of 1.0). The CQEGG was generally lower than the CQKYM. Within subjects, the CQEGG co-varied with the CQKYM, but with changing offsets depending on method. The CQEGG cannot be calculated for falsetto phonation with little adduction, since there is no complete glottal closure. Basic criterion-level methods with thresholds of 0.2 or 0.25 gave the best match to the CQKYM data. The results suggest that contacting and de-contacting in the EGG might not refer to the same physical events as do the beginning and cessation of airflow.
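
    The basic criterion-level method that fared best above can be sketched for a single EGG cycle: the contact quotient is the fraction of the cycle during which the normalized EGG amplitude exceeds a criterion such as 0.25. A toy example on a synthetic waveform; a real implementation must first segment the signal into cycles and handle baseline drift:

        import numpy as np

        def cq_criterion(egg_cycle, criterion=0.25):
            # Criterion-level contact quotient: samples above `criterion` of the
            # cycle's peak-to-peak range are counted as vocal fold contact.
            x = egg_cycle - egg_cycle.min()
            x = x / x.max()                  # normalize the cycle to 0..1
            return float(np.mean(x >= criterion))

        t = np.linspace(0, 1, 200, endpoint=False)
        cycle = np.sin(2 * np.pi * t) ** 3   # crude stand-in for one EGG pulse
        print(cq_criterion(cycle, 0.25))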

  • 19. Jers, H.
    et al.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Intonation analysis of a multi-channel choir recording (2004). In: Proc. of Baltic-Nordic Acoustics Meeting 2004, B-NAM 2004, Mariehamn, Åland, 2004. Conference paper (Other academic)
  • 20.
    Jers, Harald
    et al.
    Mannheim University of Music.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Vocal Ensembles: Chapter 20 (2022). In: The Oxford Handbook of Music Performance, Volume 2 / [ed] Gary E. McPherson, Oxford University Press, 2022, 1st ed., p. 398-417. Chapter in book (Refereed)
    Abstract [en]

    A typical performance situation of a vocal ensemble or choir consists of a group of singers in a room with listeners. The choir singers on stage interact while they sing, since they also hear the sound of the neighboring singers and react accordingly. From a physical point of view, the choir singers can be regarded as sound sources. The properties of the room influence the sound, and the listeners perceive the sound event as sound receivers. Furthermore, the processes in the choir can also be described acoustically, which affects the overall performance. The room influences the timbre of the sound on its way to the audience, the receiver. Reflection, absorption, diffraction, and refraction influence the timbre in the room. The sound in a performance space can be divided into a near field very close to the singer and a far field. The distance at which the far field can be assumed is strongly dependent on the acoustics of the room. Especially for singers within a choir, the differentiation between these sound fields is important for hearing oneself and the other singers. The position of the singers, their directivity, and the seating position of the listener in the audience will have an influence on listener perception. Furthermore, this chapter gives background information on intonation and synchronization aspects, which are most relevant for any vocal ensemble situation. Using this knowledge, intuitive behavior and performance practice can be explained and new adaptations can be suggested for singing in vocal ensembles.

  • 21. Kahlin, Daniel
    et al.
    Ternström, Sten
    The chorus effect revisited - experiments in frequency-domain analysis and simulation of ensemble sounds (1999). In: Conference Proceedings of the EUROMICRO, IEEE, 1999, Vol. 2, p. 75-80. Conference paper (Refereed)
    Abstract [en]

    The so-called ’chorus’ or ’ensemble’ effect is interesting both musically and perceptually. It is usually imitated in effect devices using slowly varying time shifts, giving the impression of rotating speakers rather than of an ensemble. M. Dolson (1983) found that the quasi-random amplitude modulation of beating partials alone can cue the perception of ensemble. The small changes in frequency, he found, are less salient perceptually. This suggests an alternative simulation of the chorus effect. Attempts were made to corroborate Dolson’s finding, and to simulate ensembles in the frequency domain by modulating only partial tone amplitudes, using three approaches: filter banks, real FFTs and complex FFTs. The exact partial envelopes of a choral sound were found to be elusive, partly because the sidebands of one partial will overlap its neighbours at higher frequencies. The outcome of these trials is discussed and illustrated with sound examples. © 1999 IEEE.
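
    Dolson's observation, cited above, suggests that an ensemble-like sound can be approximated by keeping partial frequencies fixed and giving each partial a slow quasi-random amplitude envelope. A time-domain additive sketch of that idea (the paper itself worked with filter banks and FFTs; the envelope rates and depths here are arbitrary assumptions):

        import numpy as np

        fs, dur = 44100, 2.0
        t = np.arange(int(fs * dur)) / fs
        rng = np.random.default_rng(3)

        f0, n_partials = 220.0, 10
        tone = np.zeros_like(t)
        for k in range(1, n_partials + 1):
            # Slow quasi-random envelope: a few sub-4 Hz sinusoids imitate the
            # amplitude beating of many slightly detuned voices.
            env = np.ones_like(t)
            for _ in range(3):
                env += 0.3 * np.sin(2 * np.pi * rng.uniform(0.5, 4.0) * t
                                    + rng.uniform(0, 2 * np.pi))
            tone += (env / k) * np.sin(2 * np.pi * k * f0 * t)
        tone /= np.abs(tone).max()           # normalized 'ensemble-like' tone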

  • 22.
    Kittimathaveenan, Kajornsak
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Speech Communication and Technology.
    Localisation in virtual choirs: outcomes of simplified binaural rendering (2023). Conference paper (Refereed)
    Abstract [en]

    A virtual choir would find several uses in choral pedagogy and research, but it would need a relatively small computational footprint for wide uptake. On the premise that very accurate localisation might not be needed for virtual rendering of the character of the sound inside an ensemble of singers, a localisation test was conducted using binaural stimuli created using a simplified approach, with parametrically controlled delays and variable low-pass filters (historically known as a ‘shuffler’ circuit) instead of head-related impulse responses. The direct sound from a monophonic anechoic recording of a soprano was processed (1) by sending it to a reverb algorithm for making a room-acoustic diffuse field with unchanging properties, (2) with a second-order low-pass filter with a cut-off frequency descending to 3 kHz for sources from behind, (3) with second-order low-pass head-shading filters with an angle-dependent cut-off frequency for the left/right lateral shadings of the head, and (4) with the gain of the direct sound being inversely proportional to virtual distance. The recorded singer was modelled as always facing the listener; no frequency-dependent directivity was implemented. Binaural stimuli corresponding to 24 different singer positions (8 angles and 3 distances) were synthesized. Thirty participants heard the stimuli in randomized order, and indicated the perceived location of the singer on polar plot response sheets, with categories to indicate the possible responses. The listeners’ discrimination of the distance categories 0.5, 1 and 2 meters (1 correct out of 3 possible) was good, at about 80% correct. Discrimination of the angle of incidence, in 45-degree categories (1 correct out of 8 possible), was fair, at 47% correct. Angle errors were mostly on the ‘cone of confusion’ (back-front symmetry), suggesting that the back-front cue was not very salient. The correct back-front responses (about 50%) dominated only somewhat over the incorrect ones (about 38%). In an ongoing follow-up study, multi-singer scenarios will be tested, and a more detailed yet still parametric filtering scheme will be explored.
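
    The simplified rendering chain described above (no HRIRs; delays, second-order low-pass filters, and distance-scaled gain) can be approximated in a few lines. A rough sketch; the interaural-delay constant and the 16 kHz-to-4 kHz lateral cutoff sweep are assumptions for illustration, and the back-hemisphere filter from the paper is omitted:

        import numpy as np
        from scipy.signal import butter, lfilter

        fs = 44100

        def render_position(mono, azimuth_deg, distance_m):
            # Crude binaural pan: gain ~ 1/distance, an interaural delay from
            # the azimuth, and a 2nd-order low-pass on the far ear (head shadow).
            az = np.deg2rad(azimuth_deg)
            gain = 1.0 / max(distance_m, 0.1)
            shift = int(abs(0.0007 * np.sin(az)) * fs)   # up to ~0.7 ms delay
            near = mono * gain
            far = np.pad(mono, (shift, 0))[:len(mono)] * gain
            fc = 16000 - 12000 * abs(np.sin(az))         # cutoff falls laterally
            b, a = butter(2, fc / (fs / 2))
            far = lfilter(b, a, far)
            # Left/right assignment; the near-ear convention is arbitrary here.
            return (near, far) if azimuth_deg >= 0 else (far, near)

        src = np.random.default_rng(4).standard_normal(fs)  # 1 s placeholder source
        left, right = render_position(src, azimuth_deg=45, distance_m=1.0)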

  • 23.
    Körner Gustafsson, Joakim
    et al.
    Karolinska Institutet.
    Södersten, Maria
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Schalling, Ellika
    Long-term effects of Lee Silverman Voice Treatment on daily voice use in Parkinson’s disease as measured with a portable voice accumulator (2019). In: Logopedics, Phoniatrics, Vocology, ISSN 1401-5439, E-ISSN 1651-2022, Vol. 44, no 3, p. 124-133. Article in journal (Refereed)
    Abstract [en]

    This study examines the effects of an intensive voice treatment focusing on increasing voice intensity, LSVT LOUD® (Lee Silverman Voice Treatment), on voice use in daily life in a participant with Parkinson’s disease, using a portable voice accumulator, the VoxLog. A secondary aim was to compare voice use between the participant and a matched healthy control. Participants were an individual with Parkinson’s disease and his healthy monozygotic twin. Voice use was registered with the VoxLog during 9 weeks for the individual with Parkinson’s disease and 2 weeks for the control. This included baseline registrations for both participants, 4 weeks during LSVT LOUD for the individual with Parkinson’s disease, and 1 week after treatment for both participants. For the participant with Parkinson’s disease, follow-up registrations at 3, 6, and 12 months post-treatment were made. The individual with Parkinson’s disease increased voice intensity during registrations in daily life by 4.1 dB post-treatment and by 1.4 dB at 1-year follow-up, compared to before treatment. When monitored during laboratory recordings, an increase of 5.6 dB was seen post-treatment and 3.8 dB at 1-year follow-up. Changes in voice intensity were interpreted as a treatment effect, as no significant correlations between changes in voice intensity and background noise were found for the individual with Parkinson’s disease. The increase in voice intensity in a laboratory setting was comparable to findings previously reported following LSVT LOUD. The increase registered using ambulatory monitoring in daily life was lower, but still reflects a clinically relevant change.

  • 24.
    Körner Gustafsson, Joakim
    et al.
    Karolinska Institutet.
    Södersten, Maria
    Karolinska Institutet.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Schalling, Ellika
    Karolinska Institutet.
    Treatment of Hypophonia in Parkinson’s Disease Through Biofeedback in Daily Life Administered with a Portable Voice Accumulator (2021). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed)
    Abstract [en]

    Objectives

    The purpose of this study was to assess the outcome following continuous tactile biofeedback of voice sound level, administered with a portable voice accumulator, to individuals with Parkinson's disease (PD).

    Method

    Nine out of 16 participants with PD completed a 4-week intervention program where biofeedback of voice sound level was administered with the portable voice accumulator VoxLog during speech in daily life. The feedback, a tactile vibration signal from the device, was activated when the wearer used a voice sound level below an individually predetermined threshold level, reminding the wearer to increase voice sound level during speech. Voice use was registered in daily life with the VoxLog during the intervention period as well as during one baseline week, one follow-up week post intervention and 1 week 3 months post intervention. Self-to-other ratio (SOR), which is the difference between voice sound level and environmental noise, was studied in multiple noise ranges.

    Results

    A significant increase in SOR across all noise ranges of 2.28 dB (SD: 0.55) was seen for participants with scores above the cut-off for normal function (>26 points) on the cognitive screening test Montreal Cognitive Assessment (MoCA) (n = 5). No significant increase was seen for the group of participants with MoCA scores below 26 (n = 4). Forty-four percent ended their participation early, all of whom scored below 26 on MoCA (n = 7).

    Conclusions

    Biofeedback administered in daily life regarding voice level may help individuals with PD to increase their voice sound level in relation to environmental noise in daily life, but only for a limited subset. Only participants with normal cognitive function as screened by MoCA improved their voice sound level in relation to environmental noise.
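
    The self-to-other ratio (SOR) used above is simply the difference in dB between the wearer's voice level and the environmental noise, examined per noise range. A minimal sketch with invented frame data, binning SOR into 10 dB noise ranges:

        import numpy as np

        rng = np.random.default_rng(5)
        voice_db = 60 + 5 * rng.standard_normal(1000)   # toy per-frame voice levels
        noise_db = rng.uniform(40, 80, 1000)            # toy environmental noise levels

        sor = voice_db - noise_db                       # self-to-other ratio, in dB
        edges = np.arange(40, 81, 10)                   # noise ranges 40-50, ..., 70-80
        for lo, hi in zip(edges[:-1], edges[1:]):
            sel = (noise_db >= lo) & (noise_db < hi)
            print(f"noise {lo}-{hi} dB: mean SOR {sor[sel].mean():+.1f} dB")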

  • 25.
    Lamarche, Anick
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Morsomme, Dominique
    Université Catholique de Louvain.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Not Just Sound II: An Investigation of Singer-Patient Self-Perceptions Mapped into the Voice Range Profile (2008). In: Journal of Speech, Language and Hearing Research, ISSN 1092-4388, E-ISSN 1558-9102. Article in journal (Other academic)
    Abstract [en]

    Purpose: In aiming at higher specificity in clinical evaluations of the singing voice, singer perceptions were included and tested in conjunction with the voice range profile. Method: The use of a commercial phonetograph supplemented by a hand-held response button was clinically tested with 13 subjects presenting voice complaints. Singer patients were asked to press a button to indicate sensations of vocal discomfort or instability during phonation. Each press was registered at the actual position in the Voice Range Profile (VRP) so as to mark areas of difficulty. Consistency of button press behavior was assessed with a method developed previously. Results: In spite of their voice complaints, subjects did not press the button as much as healthy singers. Like healthy singers, the singer-patient group demonstrated consistent behavior, but tended to press the button in completely different areas of the VRP space. The location of the presses was predominantly in the interior of the VRP and concentrated in a small fundamental frequency range. The reasons for these outcomes are examined in an extensive discussion. Conclusion: The button-augmented VRP could be a much-needed resource for clinicians, but requires further development and work.

  • 26.
    Lamarche, Anick
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    An Exploration of Skin Acceleration Level as a Measure of Phonatory Function in Singing (2008). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 22, no 1, p. 10-22. Article in journal (Refereed)
    Abstract [en]

    Two kinds of fluctuations are observed in phonetogram recordings of singing. Sound pressure level (SPL) can vary due to vibrato and also due to the effect of open and closed vowels. Since vowel variation is mostly a consequence of vocal tract modification and is not directly related to phonatory function, it could be helpful to suppress such variation when studying phonation. Skin acceleration level (SAL), measured at the jugular notch and on the sternum, might be less influenced by effects of the vocal tract. It is explored in this study as an alternative measure to SPL. Five female singers sang vowel series on selected pitches and in different tasks. Recorded data were used to investigate two null hypotheses: (1) SPL and SAL are equally influenced by vowel variation and (2) SPL and SAL are equally correlated to subglottal pressure (Ps). Interestingly, the vowel variation effect was small in both SPL and SAL. Furthermore, in comparison to SPL, SAL correlated weakly to Ps. SAL exhibited practically no dependence on fundamental frequency; rather, its major determinant was the musical dynamic. This results in a non-sloping, square-like phonetogram contour. These outcomes show that SAL can potentially facilitate phonetographic analysis of the singing voice.

  • 27.
    Lamarche, Anick
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Hertegård, Stellan
    Karolinska Institutet.
    Not just sound: Supplementing the voice range profile with the singer's own perceptions of vocal challenges (2009). In: Logopedics, Phoniatrics, Vocology, ISSN 1401-5439, E-ISSN 1651-2022, Vol. 34, no 1, p. 3-10. Article in journal (Refereed)
    Abstract [en]

    A commercial phonetograph was complemented with a response button, such that presses resulted in marked regions in the voice range profile (VRP). This study reports the VRP data of 16 healthy female professionally trained singers (7 mezzosopranos and 9 sopranos). Subjects pressed the button to indicate sensations of vocal instability or reduced control during phonation. Each press thereby marked potential areas of difficulty. A method is presented to quantify the consistency of button use for repeated tasks. The pattern of button presses was significantly consistent within subjects. As expected, the singers pressed at the extremes of VRP contours as well as at register transitions. These results and the potential of the method for the assessment of vocal problems of singers are discussed.

  • 28.
    Lamarche, Anick
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Pabon, Peter
    Royal Conservatory, the Hague/University Utrecht/Voice Quality Systems.
    The Singer’s Voice Range Profile: Female Professional Opera Soloists (2010). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 24, no 4, p. 410-426. Article in journal (Refereed)
    Abstract [en]

    This work concerns the collection of 30 Voice Range Profiles (VRPs) of female operatic voices. Objectives: We address the questions: Is there a need for a singer’s protocol in VRP acquisition? Are physiological measurements sufficient, or should the measurement of performance capabilities also be included? Can we address the female singing voice in general, or is there a case for categorizing voices when studying phonetographic data? Method: Subjects performed a series of structured tasks involving both standard speech voice protocols and additional singing tasks. Singers also completed an extensive questionnaire. Results: Physiological VRPs differ from performance VRPs. Two new VRP metrics, the voice area above a defined level threshold and the dynamic range independent of F0, were found to be useful in the analysis of singer VRPs. Task design had no effect on performance VRP outcomes. Voice category differences were mainly attributable to phonation-frequency-based information. Conclusion: Results support the clinical importance of addressing the vocal instrument as it is used in performance. Equally important is the elaboration of a protocol suitable for the singing voice. The given context and instructions can be more important than task design for performance VRPs. Yet, for physiological VRP recordings, task design remains critical. Both types of VRPs are suggested for a singer’s voice evaluation.

  • 29. Lindström, Fredric
    et al.
    Waye, Kerstin Persson
    Södersten, Maria
    McAllister, Anita
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Observations of the Relationship Between Noise Exposure and Preschool Teacher Voice Usage in Day-Care Center Environments (2011). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 25, no 2, p. 166-172. Article in journal (Refereed)
    Abstract [en]

    Although the relationship between noise exposure and vocal behavior (the Lombard effect) is well established, actual vocal behavior in the workplace is still relatively unexamined. The first purpose of this study was to investigate correlations between noise level and both voice level and voice average fundamental frequency (F0) for a population of preschool teachers in their normal workplace. The second purpose was to study the vocal behavior of each teacher to investigate whether individual vocal behaviors or certain patterns could be identified. Voice and noise data were obtained for female preschool teachers (n = 13) in their workplace, using wearable measurement equipment. Correlations between noise level and voice level, and between voice level and F0, were calculated for each participant and ranged from 0.07 to 0.87 for voice level and from 0.11 to 0.78 for F0. The large spread of the correlation coefficients indicates that the teachers react individually to the noise exposure. For example, some teachers increase their voice-to-noise level ratio when the noise is reduced, whereas others do not.

  • 30.
    Lã, Filipa M.B.
    et al.
    University of Distance-Learning, MADRID, Spain.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Flow ball-assisted training: immediate effects on vocal fold contacting (2019). In: Pan-European Voice Conference 2019 / [ed] Jenny Iwarsson, Stine Løvind Thorsen, University of Copenhagen, 2019, p. 50-51. Conference paper (Refereed)
    Abstract [en]

    Background: The flow ball is a device that creates a static backpressure in the vocal tract while providing real-time visual feedback of airflow. A ball height of 0 to 10 cm corresponds to airflows of 0.2 to 0.4 L/s. These high airflows with low transglottal pressure correspond to low flow resistances, similar to those obtained when phonating into straws of 3.7 mm diameter and 2.8 cm length. Objectives: To investigate whether there are immediate effects of flow ball-assisted training on vocal fold contact. Methods: Ten singers (five males and five females) performed a messa di voce at different pitches over one octave in three different conditions: before, during and after phonating with a flow ball. For all conditions, both audio and electrolaryngographic (ELG) signals were simultaneously recorded using a Laryngograph microprocessor. The vocal fold contact quotient Qci (the area under the normalized EGG cycle) and dEGGmaxN (the normalized maximum rate of change of vocal fold contact area) were obtained for all EGG cycles, using the FonaDyn system. We also introduce a compound metric Ic, the ‘index of contact’ [Qci × log10(dEGGmaxN)], which goes to zero at no contact. It combines information from both Qci and dEGGmaxN and is thus comparable across subjects. The intra-subject means of all three metrics were computed and visualized by colour-coding over the fo-SPL plane, in cells of 1 semitone × 1 dB. Results: Overall, the use of flow ball-assisted phonation had a small yet significant effect on overall vocal fold contact across the whole messa di voce exercise. Larger effects were evident locally, i.e., in parts of the voice range. Comparing the pre-post flow-ball conditions, there were differences in Qci and/or dEGGmaxN. These differences were generally larger in male than in female voices. Ic typically decreased after flow ball use, for males but not for females. Conclusion: Flow ball-assisted training seems to modify vocal fold contacting gestures, especially in male singers.
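
    The compound metric is given explicitly in the abstract: Ic = Qci × log10(dEGGmaxN). A direct sketch; the numeric examples are invented, and the reading that dEGGmaxN is about 1 for a contact-free, near-sinusoidal EGG cycle (so that Ic goes to zero) is an interpretation of the stated property:

        import numpy as np

        def index_of_contact(qci: float, deggmax_n: float) -> float:
            # Ic = Qci * log10(dEGGmaxN); with dEGGmaxN ~ 1 at no contact,
            # log10(dEGGmaxN) -> 0 and hence Ic -> 0.
            return qci * np.log10(deggmax_n)

        print(index_of_contact(qci=0.45, deggmax_n=6.0))    # firmly contacting phonation
        print(index_of_contact(qci=0.10, deggmax_n=1.05))   # near-zero contact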

  • 31.
    Lã, Filipa M.B.
    et al.
    University of Distance-Learning, MADRID, Spain.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Flow ball-assisted voice training: Immediate effects on vocal fold contacting (2020). In: Biomedical Signal Processing and Control, ISSN 1746-8094, E-ISSN 1746-8108, Vol. 62, article id 102064. Article in journal (Refereed)
    Abstract [en]

    Objective: Effects of exercises using a tool that promotes a semi-occluded artificially elongated vocal tract with real-time visual feedback of airflow – the flow ball – were tested using voice maps of EGG time-domain metrics. Methods: Ten classically trained singers (5 males and 5 females) were asked to sing messa di voce exercises on eight scale tones, performed in three consecutive conditions: baseline (‘before’), flow ball phonation (‘during’), and again without the flow ball (‘after’). These conditions were repeated eight times in a row: one scale tone at a time, on an ascending whole tone scale. Audio and electroglottographic signals were recorded using a Laryngograph microprocessor. Vocal fold contacting was assessed using three time-domain metrics of the EGG waveform, using FonaDyn. The quotient of contact by integration, Qci, the normalized peak derivative, QΔ, and the index of contacting Ic, were quantified and compared between ‘before’ and ‘after’ conditions. Results: Effects of flow ball exercises depended on singers’ habitual phonatory behaviours and on the position in the voice range. As computed over the entire range of the task, Qci was reduced by about 2% in five of ten singers. QΔ was 2–6% lower in six of the singers, and 3–4% higher only in the two bass-baritones. Ic decreased by almost 4% in all singers. Conclusion: Overall, vocal adduction was reduced and a gentler vocal fold collision was observed for the ‘after’ conditions. Significance: Flow ball exercises may contribute to the modification of phonatory behaviours of vocal pressedness.

  • 32. McAllister, Anita
    et al.
    Sederholm, E
    Ternström, Sten
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    Sundberg, Johan
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    Perturbation and hoarseness: a pilot study of six children's voices (1996). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 10, no 3. Article in journal (Refereed)
    Abstract [en]

    Fundamental frequency (F0) perturbation has been found to be useful as an acoustic correlate of the perception of dysphonia in adult voices. In a previous investigation, we showed that hoarseness in children's voices is a stable concept composed mainly of three predictors: hyperfunction, breathiness, and roughness. In the present investigation, the relation between F0 perturbation and hoarseness as well as its predictors was analyzed in running speech of six children representing different degrees of hoarseness. Two perturbation measures were used: the standard deviation of the distribution of perturbation data and the mean of the absolute value of perturbation. The results revealed no clear relation.

  • 33. Monson, B.
    et al.
    Lotto, A.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Detection of high-frequency energy changes in sustained vowels produced by singers (2011). In: Journal of the Acoustical Society of America, ISSN 0001-4966, Vol. 129, no 4, p. 2263-2268. Article in journal (Refereed)
    Abstract [en]

    The human voice spectrum above 5 kHz receives little attention. However, there are reasons to believe that this high-frequency energy (HFE) may play a role in perceived quality of voice in singing and speech. To fulfill this role, differences in HFE must first be detectable. To determine human ability to detect differences in HFE, the levels of the 8- and 16-kHz center-frequency octave bands were individually attenuated in sustained vowel sounds produced by singers and presented to listeners. Relatively small changes in HFE were in fact detectable, suggesting that this frequency range potentially contributes to the perception of especially the singing voice. Detection ability was greater in the 8-kHz octave than in the 16-kHz octave and varied with band energy level.

  • 34. Morris, R.J.
    et al.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    LoVetri, J.
    Berkun, D.
    Long-Term Average Spectra From a Youth Choir Singing in Three Vocal Registers and Two Dynamic Levels2012In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 26, no 1, p. 30-36Article in journal (Refereed)
    Abstract [en]

    Objectives/Hypothesis: Few studies have reported the acoustic characteristics of youth choirs. In addition, scant data are available on youth choruses making the adjustments needed to sing at different dynamic levels in different registers. Therefore, the purpose of this study was to acoustically analyze the singing of a youth chorus to observe the evidence of the adjustments that they made to sing at two dynamic levels in three singing registers. Study Design: Single-group observational study. Methods: The participants were 47 members of the Brooklyn Youth Chorus who sang the same song sample in head, mixed, and chest voice at piano and forte dynamic levels. The song samples were recorded and analyzed using long-term average spectra and related spectral measures. Results: The spectra revealed different patterns among the registers. These differences imply that the singers were making glottal adjustments to sing the different register and dynamic level versions of the song. The duration of the closed phase, as estimated from the amplitudes of the first two harmonics, differed between the chest and head register singing at both dynamic levels. In addition, the spectral slopes differed among all three registers at both dynamic levels. Conclusions: These choristers were able to change registers and dynamic levels quickly and with minimal prompting. Also, these acoustic measures may be a useful tool for evaluating some singing skills of young choristers.
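
    The long-term average spectrum (LTAS) underlying such analyses is simply a power spectrum averaged over many frames of the recording; the closed-phase estimate mentioned above is then read from the levels of the first two harmonics (H1 and H2) in such a spectrum. A generic sketch, with analysis settings (window length, overlap) chosen arbitrarily rather than taken from the study:

    ```python
    import numpy as np
    from scipy.signal import welch

    def ltas_db(x, fs, nfft=4096):
        """Long-term average spectrum: power spectra of successive,
        half-overlapping frames averaged over the whole recording,
        returned in dB (arbitrary reference)."""
        f, pxx = welch(x, fs=fs, nperseg=nfft, noverlap=nfft // 2)
        return f, 10.0 * np.log10(pxx + 1e-20)
    ```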

  • 35. Murphy, D.
    et al.
    Shelley, S.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Howard, D.
    The dynamically varying digital waveguide mesh2007In: Proceedings of the 19th International Congress on Acoustics / [ed] Calvo-Manzano, A. et al., 2007, p. 210-Conference paper (Other academic)
    Abstract [en]

    The digital waveguide mesh (DWM) is a multi-dimensional numerical simulation technique for modelling vibrating objects that support acoustic wave propagation, yielding the sound output of the modelled object for a given excitation. To date, most DWM-based simulations produce the static impulse response of the system for given initial and boundary conditions. This approach is often applied to room acoustics modelling problems, where impulse responses generated offline for computationally large or complex systems can later be rendered in real time using convolution-based reverberation. More recently, work has explored how the DWM might be extended to allow dynamic variation and the possibility of real-time interactive sound synthesis. This paper introduces the basic DWM model and how it might be extended to include dynamic changes and user interaction as part of the simulation. Example applications that make use of this new dynamic DWM are explored, including the synthesis of simple sound objects and the more complex problem of articulatory speech and singing synthesis based on a multi-dimensional simulation of the vocal tract.
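
    The core of such a mesh is the lossless scattering junction: each node's pressure is (2/N) times the sum of its N incoming wave components, and each outgoing component is the node pressure minus the corresponding incoming one. A minimal sketch of one scattering pass on a rectilinear 2-D mesh (propagation delays, boundaries, and excitation omitted; the array layout is an arbitrary choice for illustration):

    ```python
    import numpy as np

    def dwm_scatter(p_in):
        """One scattering pass of a rectilinear 2-D digital waveguide mesh.

        p_in: array of shape (4, H, W) with each node's incoming wave
        component from its four neighbours (N, S, W, E ports). Returns
        the node pressures and the outgoing components; on the next time
        step, each outgoing component becomes the opposite-port incoming
        component of the neighbouring node, delayed by one sample."""
        p_node = 0.5 * p_in.sum(axis=0)      # (2/N) * sum, with N = 4 ports
        p_out = p_node[None, :, :] - p_in    # lossless scattering
        return p_node, p_out
    ```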

  • 36. Murphy, Damian T.
    et al.
    Jani, Mátyás
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Articulatory vocal tract synthesis in Supercollider2015In: Proc. of the 18th Int. Conference on Digital Audio Effects (DAFx-15), Norwegian University of Science and Technology , 2015, p. 307-313Conference paper (Refereed)
    Abstract [en]

    The APEX system enables vocal tract articulation using a reduced set of user-controllable parameters, obtained by means of Principal Component Analysis of X-ray tract data. From these articulatory profiles it is then possible to calculate cross-sectional area function data that can be used as input to a number of articulatory speech synthesis algorithms. In this paper, the Kelly-Lochbaum 1-D digital waveguide vocal tract is used, and both the APEX control model and the synthesis engine have been implemented and tested in SuperCollider. Accurate formant synthesis and real-time control are demonstrated, although for multi-parameter speech-like articulation a more direct mapping from tract to synthesizer tube sections is needed. SuperCollider provides an excellent framework for the further exploration of this work.
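
    At the heart of the Kelly-Lochbaum model, each junction between adjacent tube sections scatters travelling waves according to a reflection coefficient computed from the two cross-sectional areas. A minimal sketch of that mapping (the waveguide delay lines and the glottis/lip terminations are omitted):

    ```python
    import numpy as np

    def kl_reflection_coefficients(areas):
        """Reflection coefficients at the junctions of a Kelly-Lochbaum
        tube model, from a cross-sectional area function (glottis to
        lips): k_i = (A_i - A_{i+1}) / (A_i + A_{i+1})."""
        a = np.asarray(areas, dtype=float)
        return (a[:-1] - a[1:]) / (a[:-1] + a[1:])

    # A uniform tube scatters nothing:
    # kl_reflection_coefficients([1.0, 1.0, 1.0, 1.0]) -> array([0., 0., 0.])
    ```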

  • 37.
    Nilsonne, Åsa
    et al.
    Karolinska Institutet.
    Sundberg, Johan
    KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics.
    Ternström, Sten
    KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics.
    Askenfelt, Anders
    KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics.
    Measuring the rate of change of voice fundamental frequency in fluent speech during mental depression1988In: The Journal of the Acoustical Society of America, Vol. 83, no 2, p. 716-728Article in journal (Refereed)
    Abstract [en]

    A method of measuring the rate of change of fundamental frequency has been developed in an effort to find acoustic voice parameters that could be useful in psychiatric research. A minicomputer program was used to extract seven parameters from the fundamental frequency contour of tape-recorded speech samples: (1) the average rate of change of the fundamental frequency and (2) its standard deviation, (3) the absolute rate of fundamental frequency change, (4) the total reading time, (5) the percent pause time of the total reading time, (6) the mean, and (7) the standard deviation of the fundamental frequency distribution. The method is demonstrated on (a) synthetic speech and (b) voice recordings of depressed patients who were examined during depression and after improvement.
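
    The seven parameters map directly onto simple operations on a frame-based F0 track. The sketch below is a modern restatement, not the original minicomputer program; the units and the pause criterion (NaN frames) are assumptions for illustration.

    ```python
    import numpy as np

    def f0_contour_parameters(f0_hz, frame_s):
        """Sketch of the seven contour parameters listed above.

        f0_hz: frame-based F0 track with NaN in pause/unvoiced frames;
        frame_s: frame step in seconds."""
        f0_hz = np.asarray(f0_hz, dtype=float)
        voiced = ~np.isnan(f0_hz)
        f0v = f0_hz[voiced]
        # Frame-to-frame F0 change in Hz/s. For simplicity, differences
        # are also taken across pause boundaries here.
        rate = np.diff(f0v) / frame_s
        return {
            "mean_rate": rate.mean(),                 # (1) avg rate of change
            "sd_rate": rate.std(),                    # (2) its std deviation
            "mean_abs_rate": np.abs(rate).mean(),     # (3) absolute rate
            "total_time_s": len(f0_hz) * frame_s,     # (4) total reading time
            "pause_pct": 100.0 * (~voiced).mean(),    # (5) percent pause time
            "f0_mean": f0v.mean(),                    # (6) mean F0
            "f0_sd": f0v.std(),                       # (7) std of F0
        }
    ```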

  • 38.
    Nix, John
    et al.
    University of Texas at San Antonio.
    Jers, Harald
    Mannheim University of Music.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Acoustical, psychoacoustical, and pedagogical considerations for choral singing with covid-19 health measures2020In: Choral Journal, ISSN 0009-5028, Vol. 61, no 3, p. 32-40Article in journal (Refereed)
    Abstract [en]

    The COVID-19 pandemic has had a tremendous impact on many aspects of daily life. Accepted means for safely gathering persons for any activity include meeting outdoors if possible, maintaining 2 or more meters (6 feet) physical distance between persons, using high ventilation rates (preferably natural ventilation) to provide multiple air changes per hour if indoors, and wearing masks to prevent the spread of larger droplets. However, applying these health practices to choral singing has significant implications for the nature of the sound a choir creates, the perception of the choir’s sound both within and outside of the choir, and the vocal production of the singers. In this article, we hope to examine a few of these implications in more detail and to provide some suggestions for how best to respond, based on prior research in the acoustics and psychoacoustics of choral singing, stressing as always that observing necessary health measures is paramount.

  • 39.
    Nylén, Helmer
    et al.
    KTH, School of Engineering Sciences (SCI).
    Chatterjee, Saikat
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Detecting Signal Corruptions in Voice Recordings For Speech Therapy2021In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Institute of Electrical and Electronics Engineers (IEEE) , 2021, p. 386-390Conference paper (Refereed)
  • 40.
    Pabon, Peter
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Howard, David M.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Kob, Malte
    Eckel, Gerhard
    KTH, School of Computer Science and Communication (CSC), Media Technology and Interaction Design, MID.
    Future Perspectives2017In: Oxford Handbook of Singing / [ed] Welch, Graham; Howard, David M.; Nix, John, Oxford University Press, 2017, Vol. 1Chapter in book (Other academic)
    Abstract [en]

    This chapter, through examining several emerging or continuing areas of research, serves to look ahead at possible ways in which humans, with the help of technology, may interact with each other vocally as well as musically. Some of the topic areas, such as the use of the Voice Range Profile, hearing modeling spectrography, voice synthesis, distance masterclasses, and virtual acoustics, have obvious pedagogical uses in the training of singers. Others, such as the use of 3D printed vocal tracts and computer music composition involving the voice, may lead to unique new ways in which singing may be used in musical performance. Each section of the chapter is written by an expert in the field who explains the technology in question and how it is used, often drawing upon recent research led by the chapter authors.

  • 41.
    Pabon, Peter
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Stallinga, R.
    Södersten, M.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Effects on Vocal Range and Voice Quality of Singing Voice Training: The Classically Trained Female Voice2014In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 28, no 1, p. 36-51Article in journal (Refereed)
    Abstract [en]

    Objectives: A longitudinal study was performed on the acoustical effects of singing voice training under a given study programme, using the Voice Range Profile (VRP). Study Design: Pre- and post-training recordings were made of students who participated in a 3-year bachelor singing study programme. A questionnaire that included questions on optimal range, register use, classification, vocal health and hygiene, mixing technique, and training goals was used to rate and categorize self-assessed voice changes. Based on the responses, a subgroup of 10 classically trained female voices was selected that was homogeneous enough for effects of training to be identified. Methods: The VRP perimeter contour was analyzed for effects of voice training. Also, a mapping within the VRP of voice quality, as expressed by the crest factor, was used to indicate the register boundaries and to monitor the acoustical consequences of the newly learned vocal technique of ‘mixed voice.’ VRPs were averaged across subjects. Findings were compared to the self-assessed vocal changes. Results: Pre-post comparison of the average VRPs showed, in the midrange, (1) a decrease in the VRP area that was associated with the loud chest voice, (2) a reduction of the crest factor values, and (3) a reduction of maximum SPL values. The students’ self-evaluations of the voice changes appeared in some cases to contradict the VRP findings. Conclusions: VRPs of individual voices were seen to change over the course of a singing education. These changes were manifest also in the group average. High-resolution computerized recording, complemented with an acoustic register marker, allows a meaningful assessment of some effects of training, on an individual basis as well as for groups comprised of singers of a specific genre. It is argued that this kind of investigation is possible only within a focussed training programme, given by a faculty that has agreed on the goals.
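
    The crest factor used here as an acoustic register marker is an elementary measure: the ratio of peak amplitude to RMS level of the audio waveform, expressed here in dB (the dB convention is an assumption for illustration).

    ```python
    import numpy as np

    def crest_factor_db(x):
        """Crest factor of an audio frame: peak amplitude over RMS level,
        in dB. High values accompany a more impulsive waveform, as in
        loud chest-register phonation."""
        x = np.asarray(x, dtype=float)
        return 20.0 * np.log10(np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2)))
    ```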

  • 42.
    Pabon, Peter
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics. Royal Conservatoire, The Hague, Netherlands.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Feature maps of the acoustic spectrum of the voice2020In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 34, no 1, p. 161.e1-161.e26Article in journal (Refereed)
    Abstract [en]

    The change in the spectrum of sustained /a/ vowels was mapped over the voice range from low to high fundamental frequency and low to high sound pressure level (SPL), in the form of the so-called voice range profile (VRP). In each interval of one semitone and one decibel, narrowband spectra were averaged both within and across subjects. The subjects were groups of 7 male and 12 female singing students, as well as a group of 16 untrained female voices. For each individual and also for each group, pairs of VRP recordings were made, with stringent separation of the modal/chest and falsetto/head registers. Maps are presented of eight scalar metrics, each of which was chosen to quantify a particular feature of the voice spectrum, over fundamental frequency and SPL. Metrics 1 and 2 chart the role of the fundamental in relation to the rest of the spectrum. Metrics 3 and 4 are used to explore the role of resonances in relation to SPL. Metrics 5 and 6 address the distribution of high frequency energy, while metrics 7 and 8 seek to describe the distribution of energy at the low end of the voice spectrum.

    Several examples are observed of phenomena that are difficult to predict from linear source-filter theory, and of the voice source being less uniform over the voice range than is conventionally assumed. These include a high-frequency band-limiting at high SPL and an unexpected persistence of the second harmonic at low SPL. The two voice registers give rise to clearly different maps. Only a few effects of training were observed, and only in the low-frequency end below 2 kHz. The results are of potential interest in voice analysis, in voice synthesis, and for new insights into the voice production mechanism.
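
    The mapping operation itself, i.e. averaging a per-frame metric into cells of one semitone by one decibel over the (fo, SPL) plane, can be sketched generically as below. The reference frequency and the dictionary representation are arbitrary choices for illustration, not the study's implementation.

    ```python
    import numpy as np

    def voice_map(f0_hz, spl_db, metric, f0_ref=55.0):
        """Average a per-frame metric into 1 semitone x 1 dB cells over
        the (fo, SPL) plane, the basic operation behind these maps."""
        semis = np.round(12.0 * np.log2(np.asarray(f0_hz) / f0_ref)).astype(int)
        dbs = np.round(np.asarray(spl_db)).astype(int)
        sums, counts = {}, {}
        for s, d, m in zip(semis, dbs, metric):
            sums[(s, d)] = sums.get((s, d), 0.0) + m
            counts[(s, d)] = counts.get((s, d), 0) + 1
        # Cell-wise mean of the metric, keyed by (semitone, dB) cell.
        return {cell: sums[cell] / counts[cell] for cell in sums}
    ```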

  • 43.
    Pabon, Peter
    et al.
    Royal Conservatoire, The Hague, NL.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC).
    Lamarche, Anick
    KTH, School of Computer Science and Communication (CSC).
    Fourier Descriptor Analysis and Unification of Voice Range Profile Contours: Method and Applications2011In: Journal of Speech, Language and Hearing Research, ISSN 1092-4388, E-ISSN 1558-9102, Vol. 54, no 3, p. 755-776Article in journal (Refereed)
    Abstract [en]

    Purpose: To describe a method for unified description, statistical modeling, and comparison of voice range profile (VRP) contours, even from diverse sources. Method: A morphologic modeling technique, which is based on Fourier descriptors (FDs), is applied to the VRP contour. The technique, which essentially involves resampling of the curve of the contour, is assessed and also compared to density-based VRP averaging methods that use the overlap count. Results: VRP contours can be usefully described and compared using FDs. The method also permits the visualization of the local covariation along the contour average. For example, the FD-based analysis shows that the population variance for ensembles of VRP contours is usually smallest at the upper left part of the VRP. To illustrate the method's advantages and possible further application, graphs are given that compare the averaged contours from different authors and recording devices, for normal, trained, and untrained male and female voices as well as for child voices. Conclusions: The proposed technique allows any VRP shape to be brought to the same uniform base. On this uniform base, VRP contours or contour elements coming from a variety of sources may be placed within the same graph for comparison and for statistical analysis.
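
    The generic FD technique can be sketched as follows: resample the closed contour to a fixed number of points by arc length, treat the points as complex numbers, and take the FFT; the low-order coefficients then form a uniform shape description that can be averaged and compared across sources. This is the textbook method, not the paper's exact normalization; n_points and n_keep are arbitrary.

    ```python
    import numpy as np

    def fourier_descriptors(x, y, n_points=128, n_keep=16):
        """Low-order Fourier descriptors of a closed contour, e.g. a VRP
        perimeter given as arrays of (x, y) = (fo, SPL) vertices."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        # Close the contour and compute cumulative arc length.
        xc, yc = np.append(x, x[0]), np.append(y, y[0])
        seg = np.hypot(np.diff(xc), np.diff(yc))
        s = np.concatenate(([0.0], np.cumsum(seg)))
        # Resample to n_points equally spaced in arc length.
        t = np.linspace(0.0, s[-1], n_points, endpoint=False)
        z = np.interp(t, s, xc) + 1j * np.interp(t, s, yc)
        # FFT; keep only the low-order coefficients as the descriptor.
        return np.fft.fft(z)[:n_keep] / n_points
    ```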

  • 44.
    Patel, Rita R.
    et al.
    Indiana University, Bloomington, IN, USA.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Quantitative and Qualitative Electroglottographic Wave Shape Differences in Children and Adults Using Voice Map-Based Analysis2021In: Journal of Speech, Language and Hearing Research, ISSN 1092-4388, E-ISSN 1558-9102, Vol. 64, no 8, p. 2977-2995Article in journal (Refereed)
    Abstract [en]

    Purpose: The purpose of this study is to identify the extent to which various measurements of contacting parameters differ between children and adults during habitual range and overlap vocal frequency/intensity, using voice map–based assessment of noninvasive electroglottography (EGG).

    Method: EGG voice maps were analyzed from 26 adults (22–45 years) and 22 children (4–8 years) during connected speech and the vowel /a/ over the habitual range, and at the overlap vocal frequency/intensity from the voice range profile task on the vowel /a/. Means and standard deviations of contact quotient by integration, normalized contacting speed, quotient of speed by integration, and cycle-rate sample entropy were obtained. Group differences were evaluated using linear mixed model analysis for the habitual-range connected speech and vowel, whereas analysis of covariance was conducted for the overlap vocal frequency/intensity from the voice range profile task. Presence of a “knee” on the EGG wave shape was determined by visual inspection for convexity along the decontacting slope of the EGG pulse and for a second-derivative zero-crossing.

    Results: The contact quotient by integration, normalized contacting speed, quotient of speed by integration, and cycle-rate sample entropy were significantly different in children compared to (a) adult males for the habitual range and (b) adult males and adult females for the overlap vocal frequency/intensity. None of the children had a “knee” on the decontacting slope of the EGG pulse.

    Conclusion: EGG parameters of contact quotient by integration, normalized contacting speed, quotient of speed by integration, cycle-rate sample entropy, and absence of a “knee” on the decontacting slope characterize the waveshape differences between children and adults, whereas the normalized contacting speed, quotient of speed by integration, cycle-rate sample entropy, and presence of a “knee” on the downward pulse slope characterize the waveshape differences between adult males and adult females.

    Supplemental Material: https://doi.org/10.23641/asha.15057345
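
    The “knee” criterion lends itself to a simple operationalization: look for a second-derivative zero-crossing on the falling (decontacting) part of the EGG cycle. A rough sketch only, assuming a clean single-peaked cycle; real EGG signals would need smoothing first, and the study combined this criterion with visual inspection.

    ```python
    import numpy as np

    def has_knee(egg_cycle):
        """Detect a 'knee' on the decontacting (falling) part of one EGG
        cycle, as a sign change of the discrete second derivative there."""
        x = np.asarray(egg_cycle, dtype=float)
        falling = x[int(np.argmax(x)):]      # decontacting slope
        d2 = np.diff(falling, n=2)           # discrete second derivative
        return bool(np.any(np.signbit(d2[:-1]) != np.signbit(d2[1:])))
    ```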

  • 45.
    Patel, Rita
    et al.
    University of Indiana, USA.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Electroglottographic voice maps of untrained vocally healthy adults with gender differences and gradients2019In: Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA): 11th International Workshop / [ed] Manfredi, C., Firenze, Italy: Firenze University Press, 2019, p. 107-110Conference paper (Refereed)
    Abstract [en]

    Baseline data from adult speakers are presented for time-domain parameters of EGG waveforms. 26 vocally healthy adults (13 males and 13 females) were recruited for the study. Four dependent variables were computed: mean contact quotient, mean peak rate of change in the contact area, index of contacting, and the audio crest factor. Small regions around the speech range distribution modes on the fo/SPL plane were used to define means and gradients. Males and females differed considerably in the audio crest factor of their speaking voice, and somewhat in their EGG contact quotient when measured at the mode point of the individual speech range profile. In males, contacting tended to increase somewhat with fo and SPL around the mode point, while in females it tended to decrease.

  • 46.
    Patel, Rita
    et al.
    University of Indiana.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Non-invasive evaluation of vibratory kinematics of phonation in children2020In: 12th International Conference on Voice Physiology and Biomechanics / [ed] Henrich Bernardoni, N.; Bailly, L., Grenoble, 2020, p. 28-Conference paper (Refereed)
    Abstract [en]

    Developmentally, children do not have the five well-defined layers of the vocal folds that adults have. Also, children are often less able to tolerate the invasiveness of endoscopy. Electroglottography (EGG) is not invasive, but little is known of how the EGGs of children differ from those of adults. The goal of this study was to quantify some differences in the shape of the EGG waveform between children and adults, while accounting for the sensitivity to variation in the independent variables fo and SPL. A novel mapping of EGG waveform parameters over the speech and voice ranges was employed.

  • 47. Reid, Katherine L. P.
    et al.
    Davis, Pamela
    Oates, Jennifer
    Cabrera, Densil
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Black, Michael
    Chapman, Janice
    The acoustic characteristics of professional opera singers performing in chorus versus solo mode2007In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 21, no 1, p. 35-45Article in journal (Refereed)
    Abstract [en]

    In this study, members of a professional opera chorus were recorded using close microphones, while singing in both choral and solo modes. The analysis included computation of long-term average spectra (LTAS) for the two song sections performed and calculation of singing power ratio (SPR) and energy ratio (ER), which provide an indication of the relative energy in the singer's formant region. Vibrato rate and extent were determined from two matched vowels, and SPR and ER were calculated for these vowels. Subjects sang with equal or more power in the singer's formant region in choral versus solo mode in the context of the piece as a whole and in individual vowels. There was no difference in vibrato rate and extent between the two modes. Singing in choral mode, therefore, required the ability to use a similar vocal timbre to that required for solo opera singing.
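
    The singing power ratio (SPR) reported here is commonly computed as the level difference between the strongest spectral peak in the 2–4 kHz band (the singer's formant region) and the strongest peak in the 0–2 kHz band. A sketch under that common definition; the study's exact windowing, and the related energy ratio (ER), are not reproduced here.

    ```python
    import numpy as np
    from scipy.signal import welch

    def singing_power_ratio_db(x, fs):
        """Singing power ratio: level of the strongest spectral peak in
        2-4 kHz minus that of the strongest peak in 0-2 kHz, taken here
        from a Welch average spectrum."""
        f, pxx = welch(x, fs=fs, nperseg=4096)
        p_db = 10.0 * np.log10(pxx + 1e-20)
        low = p_db[(f >= 0) & (f < 2000)].max()      # strongest peak, 0-2 kHz
        high = p_db[(f >= 2000) & (f <= 4000)].max() # strongest peak, 2-4 kHz
        return high - low
    ```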

  • 48. Rocchesso, D.
    et al.
    Lemaitre, G.
    Susini, P.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Boussard, P.
    Sketching sound with voice and gesture2015In: interactions, ISSN 1072-5520, E-ISSN 1558-3449, Vol. 22, no 1, p. 38-41Article in journal (Refereed)
    Abstract [en]

    Voice and gestures are natural sketching tools that can be exploited to communicate sonic interactions. In product and interaction design, sounds should be included in the early stages of the design process. Scientists of human motion have shown that auditory stimuli are important in the performance of difficult tasks and can elicit anticipatory postural adjustments in athletes. These findings justify the attention given to sound in interaction design for gaming, especially in action and sports games that afford the development of levels of virtuosity. The sonic manifestations of objects can be designed by acting on their mechanical qualities and by augmenting the objects with synthetic and responsive sounds.

  • 49. Rossing, T D
    et al.
    Sundberg, Johan
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Acoustic comparison of soprano solo and choir singing.1987In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 82, no 3, p. 830-836Article in journal (Refereed)
    Abstract [en]

    Five soprano singers were recorded while singing similar texts in both choir and solo modes of performance. A comparison of long-term-average spectra of similar passages in both modes indicates that subjects used different tactics to achieve somewhat higher concentrations of energy in the 2- to 4-kHz range when singing in the solo mode. It is likely that this effect resulted, at least in part, from a slight change of the voice source from choir to solo singing. The subjects used slightly more vibrato when singing in the solo mode.

  • 50. Rossing, T D
    et al.
    Sundberg, Johan
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    Acoustic comparison of voice use in solo and choir singing.1986In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 79, no 6, p. 1975-1981Article in journal (Refereed)
    Abstract [en]

    An experiment was carried out in which eight bass/baritone singers were recorded while singing in both choral and solo modes. Together with their own voice, they heard the sound of the rest of the choir and a piano accompaniment, respectively. The recordings were analyzed in several ways, including computation of long-time-average spectra for each passage, analysis of the sound levels in the frequency ranges corresponding to the fundamental and the "singer's formant," and a comparison of the sung levels with the levels heard by the singers. Matching pairs of vowels in the two modes were inverse filtered to determine the voice source spectra and formant frequencies for comparison. Differences in both phonation and articulation between the two modes were observed. Subjects generally sang with more power in the singer's formant region in the solo mode and with more power in the fundamental region in the choral mode. Most singers used a reduced frequency distance between the third and fifth formants for increasing the power in the singer's formant range, while the difference in the fundamental was mostly a voice source effect. In a choral singing mode, subjects usually adjusted their voice levels to the levels they heard from the other singers, whereas in a solo singing mode the level sung depended much less on the level of an accompaniment.
