KTH Publications
1 - 23 of 23
  • 1.
    D'Amario, Sara
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics. RITMO, University of Oslo.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Friberg, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    SMAC 2023: Proceedings of the Stockholm Music Acoustics Conference 2023 (2023). Conference proceedings (editor) (Other academic)
    Abstract [en]

    This volume presents the proceedings of the fifth Stockholm Music Acoustics Conference 2023 (SMAC), which took place on 14–15 June 2023 in Stockholm, Sweden. SMAC premiered at KTH in 1983 and has been organized every ten years since then. The conference is intended for academics, music performers, and instructors interested in the field of Music Acoustics. It brings together experts from different disciplines to exchange and share their recent work on many aspects of Music Acoustics, including instrument acoustics, singing voice acoustics, acoustics-based synthesis models, music performance, and music acoustics in teaching and pedagogy.

    This time, our multidisciplinary conference was organized on a smaller scale than before, as a track within the 2023 Sound and Music Computing Conference (SMC), at the KMH Royal College of Music and the KTH Royal Institute of Technology. Our warm thanks go to the SMC Network for hosting SMAC within the framework of SMC, and to all presenters and co-authors for participating. We hope that you will enjoy learning of the new science presented here.

    Sara D’Amario, Sten Ternström and Anders Friberg

    Track chairs, Editors

  • 2.
    D'Amario, Sara
    et al.
    Department of Music Acoustics, mdw – University of Music and Performing Arts Vienna, Vienna, Austria; RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway; Department of Musicology, University of Oslo, Oslo, Norway.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Goebl, Werner
    Department of Music Acoustics, mdw – University of Music and Performing Arts Vienna, Vienna, Austria.
    Bishop, Laura
    RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway; Department of Musicology, University of Oslo, Oslo, Norway.
    Body motion of choral singers (2023). In: Frontiers in Psychology, E-ISSN 1664-1078, Vol. 14. Article in journal (Refereed)
    Abstract [en]

    Recent investigations on music performances have shown the relevance of singers’ body motion for pedagogical as well as performance purposes. However, little is known about how the perception of voice-matching or task complexity affects choristers’ body motion during ensemble singing. This study focussed on the body motion of choral singers who perform in duo along with a pre-recorded tune presented over a loudspeaker. Specifically, we examined the effects of the perception of voice-matching, operationalized in terms of sound spectral envelope, and task complexity on choristers’ body motion. Fifteen singers with advanced choral experience first manipulated the spectral components of a pre-recorded short tune composed for the study, by choosing the settings they felt most and least together with. Then, they performed the tune in unison (i.e., singing the same melody simultaneously) and in canon (i.e., singing the same melody but at a temporal delay) with the chosen filter settings. Motion data of the choristers’ upper body and audio of the repeated performances were collected and analyzed. Results show that the settings perceived as least together relate to extreme differences between the spectral components of the sound. The singers’ wrists and torso motion was more periodic, their upper body posture was more open, and their bodies were more distant from the music stand when singing in unison than in canon. These findings suggest that unison singing promotes an expressive-periodic motion of the upper body.

  • 3.
    D'Amario, Sara
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics. RITMO, University of Oslo.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Goebl, Werner
    mdw – University of Music and Performing Arts Vienna.
    Bishop, Laura
    University of Oslo, NO.
    Impact of singing togetherness and task complexity on choristers' body motion (2023). In: SMAC 2023: Proceedings of the Stockholm Music Acoustics Conference 2023 / [ed] D'Amario, S., Ternström, S., Friberg, A., Stockholm: KTH Royal Institute of Technology, 2023, p. 146-150. Conference paper (Refereed)
    Abstract [en]

    We examined the impact of the perception of singing togetherness, as indexed by the spectral envelope of the sound, and of task complexity on choristers' body motion, as they performed in duo with a pre-recorded tune presented over a loudspeaker. Fifteen experienced choral singers first manipulated the spectral filter settings of the tune in order to identify the recordings they felt most and not at all together with. Then, they sang the tune in unison and canon along with the recordings featuring the chosen filter settings. Audio and motion capture data of the musicians' upper bodies during repeated performances of the same tune were collected. Results demonstrate that wrist motion was more periodic, singer posture more open, and the overall quantity of body motion higher when singing in unison than in canon; singing togetherness did not impact body motion. The current findings suggest that some body movements may support choral performance, depending on the complexity of the task condition.

  • 4.
    Elowsson, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Modeling Music: Studies of Music Transcription, Music Perception and Music Production (2018). Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    This dissertation presents ten studies focusing on three important subfields of music information retrieval (MIR): music transcription (Part A), music perception (Part B), and music production (Part C).

    In Part A, systems capable of transcribing rhythm and polyphonic pitch are described. The first two publications present methods for tempo estimation and beat tracking. A method is developed for computing the most salient periodicity (the “cepstroid”), and the computed cepstroid is used to guide the machine learning processing. The polyphonic pitch tracking system uses novel pitch-invariant and tone-shift-invariant processing techniques. Furthermore, the neural flux is introduced – a latent feature for onset and offset detection. The transcription systems use a layered learning technique with separate intermediate networks of varying depth.  Important music concepts are used as intermediate targets to create a processing chain with high generalization. State-of-the-art performance is reported for all tasks.

    Part B is devoted to perceptual features of music, which can be used as intermediate targets or as parameters for exploring fundamental music perception mechanisms. Systems are proposed that can predict the perceived speed and performed dynamics of an audio file with high accuracy, using the average ratings from around 20 listeners as ground truths. In Part C, aspects related to music production are explored. The first paper analyzes long-term average spectrum (LTAS) in popular music. A compact equation is derived to describe the mean LTAS of a large dataset, and the variation is visualized. Further analysis shows that the level of the percussion is an important factor for LTAS. The second paper examines songwriting and composition through the development of an algorithmic composer of popular music. Various factors relevant for writing good compositions are encoded, and a listening test employed that shows the validity of the proposed methods.

    The dissertation is concluded by Part D - Looking Back and Ahead, which acts as a discussion and provides a road-map for future work. The first paper discusses the deep layered learning (DLL) technique, outlining concepts and pointing out a direction for future MIR implementations. It is suggested that DLL can help generalization by enforcing the validity of intermediate representations, and by letting the inferred representations establish disentangled structures supporting high-level invariant processing. The second paper proposes an architecture for tempo-invariant processing of rhythm with convolutional neural networks. Log-frequency representations of rhythm-related activations are suggested at the main stage of processing. Methods relying on magnitude, relative phase, and raw phase information are described for a wide variety of rhythm processing tasks.

  • 5.
    Elowsson, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Polyphonic Pitch Tracking with Deep Layered Learning. Manuscript (preprint) (Other academic)
    Abstract [en]

    This paper presents a polyphonic pitch tracking system able to extract both framewise and note-based estimates from audio. The system uses six artificial neural networks in a deep layered learning setup. First, cascading networks are applied to a spectrogram for framewise fundamental frequency (f0) estimation. A sparse receptive field is learned by the first network and then used for weight-sharing throughout the system. The f0 activations are connected across time to extract pitch ridges. These ridges define a framework, within which subsequent networks perform tone-shift-invariant onset and offset detection. The networks convolve the pitch ridges across time, using as input, e.g., variations of latent representations from the f0 estimation networks, defined as the "neural flux." Finally, incorrect tentative notes are removed one by one in an iterative procedure that allows a network to classify notes within an accurate context. The system was evaluated on four public test sets: MAPS, Bach10, TRIOS, and the MIREX Woodwind quintet, achieving state-of-the-art results for all four datasets. It performs well across all subtasks: f0, pitched onset, and pitched offset tracking.

  • 6.
    Elowsson, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Tempo-Invariant Processing of Rhythm with Convolutional Neural Networks. Manuscript (preprint) (Other academic)
    Abstract [en]

    Rhythm patterns can be performed at a wide variety of tempi. This presents a challenge for many music information retrieval (MIR) systems; ideally, perceptually similar rhythms should be represented and processed similarly, regardless of the specific tempo at which they were performed. Several recent systems for tempo estimation, beat tracking, and downbeat tracking have therefore sought to process rhythm in a tempo-invariant way, often by sampling input vectors according to a precomputed pulse level. This paper describes how a log-frequency representation of rhythm-related activations can instead promote tempo invariance when processed with convolutional neural networks. The strategy incorporates invariance at a fundamental level and can be useful for most tasks related to rhythm processing. Different methods are described, relying on magnitude, on phase relationships between rhythm channels, and on raw phase information. Several variations are explored to provide direction for future implementations.
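
An editorial illustration of the core idea, not code from the manuscript: on a log-spaced tempo axis, a tempo change by a factor a becomes a translation by log2(a) octaves, which convolutional processing handles shift-equivariantly. A minimal Python sketch, with all names hypothetical:

```python
import numpy as np

def log_period_axis(min_bpm=40.0, max_bpm=300.0, bins_per_octave=12):
    """Log-spaced tempo axis: scaling the tempo by a factor a shifts
    rhythm activations by a fixed number of bins (log2(a) octaves)."""
    n_octaves = np.log2(max_bpm / min_bpm)
    n_bins = int(np.ceil(n_octaves * bins_per_octave))
    return min_bpm * 2.0 ** (np.arange(n_bins) / bins_per_octave)

axis = log_period_axis()
# Doubling the tempo (120 -> 240 BPM) moves the peak exactly one octave:
i120 = int(np.argmin(np.abs(axis - 120.0)))
i240 = int(np.argmin(np.abs(axis - 240.0)))
assert i240 - i120 == 12  # one octave = bins_per_octave bins
```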

  • 7.
    Huang, Rujing Stacy
    et al.
    University of Hong Kong.
    Holzapfel, Andre
    KTH, School of Electrical Engineering and Computer Science (EECS), Human Centered Technology, Media Technology and Interaction Design, MID.
    Sturm, Bob
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Global Ethics: From Philosophy to Practice. A Culturally Informed Ethics of Music AI in Asia (2022). In: Artificial Intelligence and Music Ecosystem / [ed] Martin Clancy, Routledge, 2022, p. 126-141. Chapter in book (Refereed)
  • 8.
    Iob, Naomi Anna
    et al.
    University Hospital Zurich.
    He, Lei
    University Hospital Zurich.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Cai, Huanchen
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Brockmann-Bauser, Meike
    University Hospital Zurich.
    Effects of Speech Characteristics on Electroglottographic and Instrumental Acoustic Voice Analysis Metrics in Women With Structural Dysphonia Before and After Treatment (2024). In: Journal of Speech, Language and Hearing Research, ISSN 1092-4388, E-ISSN 1558-9102, p. 1-22. Article in journal (Refereed)
    Abstract [en]

    Purpose: Literature suggests a dependency of the acoustic metrics smoothed cepstral peak prominence (CPPS) and harmonics-to-noise ratio (HNR) on human voice loudness and fundamental frequency (fo). Even though this has been explained with different oscillatory patterns of the vocal folds, so far it has not been specifically investigated. In the present work, the influence of three elicitation levels, calibrated sound pressure level (SPL), fo, and vowel on the electroglottographic (EGG) and time-differentiated EGG (dEGG) metrics hybrid open quotient (OQ), dEGG OQ, and peak dEGG, as well as on the acoustic metrics CPPS and HNR, was examined, and their suitability for voice assessment was evaluated. Method: In a retrospective study, 29 women with a mean age of 25 years (± 8.9, range: 18–53) diagnosed with structural vocal fold pathologies were examined before and after voice therapy or phonosurgery. Both acoustic and EGG signals were recorded simultaneously during the phonation of the sustained vowels /ɑ/, /i/, and /u/ at three elicited levels of loudness (soft/comfortable/loud) and unconstrained fo conditions. Results: A linear mixed-model analysis showed a significant effect of elicitation effort levels on peak dEGG, HNR, and CPPS (all p < .01). Calibrated SPL significantly influenced HNR and CPPS (both p < .01). Furthermore, fo had a significant effect on peak dEGG and CPPS (p < .0001). All metrics showed significant changes with regard to vowel (all p < .05). However, the treatment had no effect on the examined metrics, regardless of the treatment type (surgery vs. voice therapy). Conclusions: The value of the investigated metrics for voice assessment purposes when sampled without sufficient control of SPL and fo is limited, in that they are significantly influenced by the phonatory context, be it speech or elicited sustained vowels. Future studies should explore the diagnostic value of new data collation approaches such as voice mapping, which take SPL and fo effects into account.

  • 9.
    Jers, Harald
    et al.
    Mannheim University of Music.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Vocal Ensembles (Chapter 20) (2022). In: The Oxford Handbook of Music Performance, Volume 2 / [ed] Gary E. McPherson, Oxford University Press, 2022, 1, p. 398-417. Chapter in book (Refereed)
    Abstract [en]

    A typical performance situation of a vocal ensemble or choir consists of a group of singers in a room with listeners. The choir singers on stage interact while they sing, since they also hear the sound of the neighboring singers and react accordingly. From a physical point of view, the choir singers can be regarded as sound sources; the properties of the room influence the sound, and the listeners, as sound receivers, perceive the sound event. Furthermore, the processes in the choir can also be described acoustically, which affects the overall performance. The room influences the timbre of the sound on its way to the audience, the receiver: reflection, absorption, diffraction, and refraction all shape the timbre in the room. The sound field in a performance space can be divided into a near field very close to the singer and a far field; the distance at which the far field can be assumed is strongly dependent on the acoustics of the room. Especially for singers within a choir, the differentiation between those sound fields is important for hearing oneself and the other singers. The position of the singers, their directivity, and the seating position of the listener in the audience will have an influence on listener perception. Furthermore, this chapter gives background information on intonation and synchronization aspects, which are most relevant for any vocal ensemble situation. Using this knowledge, intuitive behavior and performance practice can be explained and new adaptations can be suggested for singing in vocal ensembles.

  • 10.
    Kittimathaveenan, Kajornsak
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Speech Communication and Technology.
    Localisation in virtual choirs: outcomes of simplified binaural rendering (2023). Conference paper (Refereed)
    Abstract [en]

    A virtual choir would find several uses in choral pedagogy and research, but it would need a relatively small computational footprint for wide uptake. On the premise that very accurate localisation might not be needed for virtual rendering of the character of the sound inside an ensemble of singers, a localisation test was conducted using binaural stimuli created with a simplified approach, with parametrically controlled delays and variable low-pass filters (historically known as a 'shuffler' circuit) instead of head-related impulse responses. The direct sound from a monophonic anechoic recording of a soprano was processed (1) by sending it to a reverb algorithm for making a room-acoustic diffuse field with unchanging properties, (2) with a second-order low-pass filter with a cut-off frequency descending to 3 kHz for sources from behind, (3) with second-order low-pass head-shading filters with an angle-dependent cut-off frequency for the left/right lateral shadings of the head, and (4) with the gain of the direct sound being inversely proportional to virtual distance. The recorded singer was modelled as always facing the listener; no frequency-dependent directivity was implemented. Binaural stimuli corresponding to 24 different singer positions (8 angles and 3 distances) were synthesized. Thirty participants heard the stimuli in randomized order and indicated the perceived location of the singer on polar plot response sheets, with categories to indicate the possible responses. The listeners' discrimination of the distance categories 0.5, 1 and 2 meters (1 correct out of 3 possible) was good, at about 80% correct. Discrimination of the angle of incidence, in 45-degree categories (1 correct out of 8 possible), was fair, at 47% correct. Angle errors were mostly on the 'cone of confusion' (back-front symmetry), suggesting that the back-front cue was not very salient. The correct back-front responses (about 50%) dominated only somewhat over the incorrect ones (about 38%). In an ongoing follow-up study, multi-singer scenarios will be tested, and a more detailed yet still parametric filtering scheme will be explored.
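
The four processing steps listed above lend themselves to a compact implementation. The sketch below is one plausible reading of that chain, not the authors' code; the spherical-head ITD model, the cutoff curves, and all names are assumptions for illustration:

```python
import numpy as np
from scipy.signal import butter, lfilter

FS = 48000          # sample rate, Hz
C = 343.0           # speed of sound, m/s
HEAD_RADIUS = 0.09  # approximate head radius, m

def render_singer(mono, angle_deg, distance_m):
    """Binaural 'shuffler'-style rendering of a dry mono source:
    interaural delay, angle-dependent low-pass head shading,
    a low-pass toward 3 kHz for rear sources, and 1/r direct gain."""
    az = np.deg2rad(angle_deg)
    # Interaural time difference (Woodworth spherical-head approximation)
    itd = HEAD_RADIUS / C * (np.sin(abs(az)) + abs(az))
    d = int(round(itd * FS))
    # Contralateral ear: delayed, plus low-pass head shading (assumed curve)
    fc = 16000.0 - 12000.0 * abs(np.sin(az))
    b, a = butter(2, fc / (FS / 2))
    far = lfilter(b, a, np.pad(mono, (d, 0))[:len(mono)])
    near = mono.copy()
    # Rear sources: second-order low-pass descending to 3 kHz
    if abs(angle_deg) > 90:
        b, a = butter(2, 3000.0 / (FS / 2))
        near, far = lfilter(b, a, near), lfilter(b, a, far)
    gain = 1.0 / max(distance_m, 0.1)  # direct gain inversely prop. to distance
    # Convention: negative angles place the source on the listener's left
    left, right = (near, far) if angle_deg <= 0 else (far, near)
    return gain * left, gain * right
```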

  • 11.
    Körner Gustafsson, Joakim
    et al.
    Karolinska Institutet.
    Södersten, Maria
    Karolinska Institutet.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Schalling, Ellika
    Karolinska Institutet.
    Treatment of Hypophonia in Parkinson's Disease Through Biofeedback in Daily Life Administered with a Portable Voice Accumulator (2021). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed)
    Abstract [en]

    Objectives

    The purpose of this study was to assess the outcome following continuous tactile biofeedback of voice sound level, administered with a portable voice accumulator, to individuals with Parkinson's disease (PD).

    Method

    Nine out of 16 participants with PD completed a 4-week intervention program where biofeedback of voice sound level was administered with the portable voice accumulator VoxLog during speech in daily life. The feedback, a tactile vibration signal from the device, was activated when the wearer used a voice sound level below an individually predetermined threshold level, reminding the wearer to increase voice sound level during speech. Voice use was registered in daily life with the VoxLog during the intervention period, as well as during one baseline week, one follow-up week post intervention, and one week three months post intervention. The self-to-other ratio (SOR), which is the difference between voice sound level and environmental noise, was studied in multiple noise ranges.

    Results

    A significant increase in SOR across all noise ranges of 2.28 dB (SD: 0.55) was seen for participants with scores above the cut-off for normal function (>26 points) on the cognitive screening test Montreal Cognitive Assessment (MoCA) (n = 5). No significant increase was seen for the group of participants with MoCA scores below 26 (n = 4). Forty-four percent ended their participation early, all of whom scored below 26 on the MoCA (n = 7).

    Conclusions

    Biofeedback of voice sound level administered in daily life may help individuals with PD to increase their voice sound level in relation to environmental noise, but only for a limited subset: only participants with normal cognitive function, as screened by the MoCA, showed this improvement.
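
For readers unfamiliar with the metric: the self-to-other ratio is simply the wearer's voice level minus the simultaneous environmental noise level, in dB, averaged within noise ranges. A minimal sketch; the binning and data layout are illustrative, not the VoxLog format:

```python
import numpy as np

def mean_sor_by_noise_range(voice_db, noise_db, edges=(40, 50, 60, 70, 80)):
    """Self-to-other ratio (SOR) = voice SPL minus environmental noise
    level; returns the mean SOR within each noise range."""
    voice_db, noise_db = np.asarray(voice_db), np.asarray(noise_db)
    sor = voice_db - noise_db  # e.g. 68 dB voice in 62 dB noise -> SOR = +6 dB
    means = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_range = (noise_db >= lo) & (noise_db < hi)
        means[f"{lo}-{hi} dB"] = float(sor[in_range].mean()) if in_range.any() else None
    return means
```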

  • 12.
    Lindeberg, Tony
    et al.
    KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
    Friberg, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Idealized computational models for auditory receptive fields (2014). Report (Other academic)
    Abstract [en]

    This paper presents a theory by which idealized models of auditory receptive fields can be derived in a principled axiomatic manner, from a set of structural properties to (i) enable invariance of receptive field responses under natural sound transformations and (ii) ensure internal consistency between spectro-temporal receptive fields at different temporal and spectral scales.

    For defining a time-frequency transformation of a purely temporal sound signal, it is shown that the framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters, with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal temporal window functions.

    When applied to the definition of a second layer of receptive fields from a spectrogram, it is shown that the framework leads to two canonical families of spectro-temporal receptive fields, in terms of spectro-temporal derivatives of either spectro-temporal Gaussian kernels for non-causal time or the combination of a time-causal generalized Gammatone filter over the temporal domain and a Gaussian filter over the log-spectral domain. For each filter family, the spectro-temporal receptive fields can either be separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Within each domain of either non-causal or time-causal time, these receptive field families are derived by uniqueness from the assumptions.

    It is demonstrated how the presented framework allows for computation of basic auditory features for audio processing and that it leads to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields measured in the inferior colliculus (ICC) and primary auditory cortex (A1) of mammals.
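
For reference, the classical Gammatone filter that the framework rederives has the standard textbook impulse response (this is not the paper's generalized family):

```latex
g(t) = a\, t^{\,n-1} e^{-2\pi b t} \cos(2\pi f_c t + \varphi), \qquad t \ge 0,
```

with order n, bandwidth parameter b, and centre frequency f_c; the generalized family derived in the report adds further degrees of freedom for trading spectral selectivity against temporal delay.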

  • 13.
    Lã, Filipa M.B.
    et al.
    University of Distance-Learning, Madrid, Spain.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Flow ball-assisted training: immediate effects on vocal fold contacting (2019). In: Pan-European Voice Conference 2019 / [ed] Jenny Iwarsson, Stine Løvind Thorsen, University of Copenhagen, 2019, p. 50-51. Conference paper (Refereed)
    Abstract [en]

    Background: The flow ball is a device that creates a static backpressure in the vocal tract while providing real-time visual feedback of airflow. A ball height of 0 to 10 cm corresponds to airflows of 0.2 to 0.4 L/s. These high airflows with low transglottal pressure correspond to low flow resistances, similar to the ones obtained when phonating into straws of 3.7 mm diameter and 2.8 cm length. Objectives: To investigate whether there are immediate effects of flow ball-assisted training on vocal fold contact. Methods: Ten singers (five males and five females) performed a messa di voce at different pitches over one octave in three different conditions: before, during, and after phonating with a flow ball. For all conditions, both audio and electrolaryngographic (ELG) signals were simultaneously recorded using a Laryngograph microprocessor. The vocal fold contact quotient Qci (the area under the normalized EGG cycle) and dEGGmaxN (the normalized maximum rate of change of vocal fold contact area) were obtained for all EGG cycles, using the FonaDyn system. We also introduce a compound metric, the 'index of contact' Ic = Qci × log10(dEGGmaxN), with the property that it goes to zero at no contact. It combines information from both Qci and dEGGmaxN and is thus comparable across subjects. The intra-subject means of all three metrics were computed and visualized by colour-coding over the fo-SPL plane, in cells of 1 semitone × 1 dB. Results: Overall, flow ball-assisted phonation had a small yet significant effect on overall vocal fold contact across the whole messa di voce exercise. Larger effects were evident locally, i.e., in parts of the voice range. Comparing the pre- and post-flow-ball conditions, there were differences in Qci and/or dEGGmaxN. These differences were generally larger in male than in female voices. Ic typically decreased after flow ball use, for males but not for females. Conclusion: Flow ball-assisted training seems to modify vocal fold contacting gestures, especially in male singers.
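
A minimal numerical sketch of the compound metric defined in the abstract; the normalization details here (in particular the scaling that makes a sinusoidal, contactless cycle give dEGGmaxN = 1) are assumptions for illustration, not the FonaDyn implementation:

```python
import numpy as np

def index_of_contact(egg_cycle):
    """Ic = Qci * log10(dEGGmaxN), computed for one EGG cycle.
    Qci: area under the amplitude- and period-normalized EGG cycle.
    dEGGmaxN: normalized peak derivative; assumed scaled so that a
    contactless (sinusoidal) cycle gives 1, making Ic go to zero."""
    x = np.asarray(egg_cycle, dtype=float)
    x = (x - x.min()) / (x.max() - x.min())   # normalize amplitude to 0..1
    qci = float(np.mean(x))                   # area over the unit period
    degg = np.diff(x) * len(x)                # derivative on the unit period
    degg_max_n = float(degg.max()) / np.pi    # assumed: = 1 for a pure sinusoid
    return qci * np.log10(degg_max_n)
```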

  • 14.
    Lã, Filipa M.B.
    et al.
    University of Distance-Learning, Madrid, Spain.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Flow ball-assisted voice training: Immediate effects on vocal fold contacting (2020). In: Biomedical Signal Processing and Control, ISSN 1746-8094, E-ISSN 1746-8108, Vol. 62, article id 102064. Article in journal (Refereed)
    Abstract [en]

    Objective: Effects of exercises using a tool that promotes a semi-occluded artificially elongated vocal tract with real-time visual feedback of airflow – the flow ball – were tested using voice maps of EGG time-domain metrics. Methods: Ten classically trained singers (5 males and 5 females) were asked to sing messa di voce exercises on eight scale tones, performed in three consecutive conditions: baseline (‘before’), flow ball phonation (‘during’), and again without the flow ball (‘after’). These conditions were repeated eight times in a row: one scale tone at a time, on an ascending whole tone scale. Audio and electroglottographic signals were recorded using a Laryngograph microprocessor. Vocal fold contacting was assessed using three time-domain metrics of the EGG waveform, using FonaDyn. The quotient of contact by integration, Qci, the normalized peak derivative, QΔ, and the index of contacting Ic, were quantified and compared between ‘before’ and ‘after’ conditions. Results: Effects of flow ball exercises depended on singers’ habitual phonatory behaviours and on the position in the voice range. As computed over the entire range of the task, Qci was reduced by about 2% in five of ten singers. QΔ was 2–6% lower in six of the singers, and 3–4% higher only in the two bass-baritones. Ic decreased by almost 4% in all singers. Conclusion: Overall, vocal adduction was reduced and a gentler vocal fold collision was observed for the ‘after’ conditions. Significance: Flow ball exercises may contribute to the modification of phonatory behaviours of vocal pressedness.

  • 15.
    Pabon, Peter
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Howard, David M.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Kob, Malte
    Eckel, Gerhard
    KTH, School of Computer Science and Communication (CSC), Media Technology and Interaction Design, MID.
    Future Perspectives (2017). In: Oxford Handbook of Singing / [ed] Welch, Graham; Howard, David M.; Nix, John, Oxford University Press, 2017, Vol. 1. Chapter in book (Other academic)
    Abstract [en]

    This chapter, through examining several emerging or continuing areas of research, serves to look ahead at possible ways in which humans, with the help of technology, may interact with each other vocally as well as musically. Some of the topic areas, such as the use of the Voice Range Profile, hearing-modeling spectrography, voice synthesis, distance masterclasses, and virtual acoustics, have obvious pedagogical uses in the training of singers. Others, such as the use of 3D-printed vocal tracts and computer music composition involving the voice, may lead to unique new ways in which singing may be used in musical performance. Each section of the chapter is written by an expert in the field who explains the technology in question and how it is used, often drawing upon recent research led by the chapter authors.

  • 16.
    Pabon, Peter
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics. Royal Conservatoire, The Hague, Netherlands.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Feature maps of the acoustic spectrum of the voice (2020). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 34, no 1, p. 161.e1-161.e26. Article in journal (Refereed)
    Abstract [en]

    The change in the spectrum of sustained /a/ vowels was mapped over the voice range from low to high fundamental frequency and low to high sound pressure level (SPL), in the form of the so-called voice range profile (VRP). In each interval of one semitone and one decibel, narrowband spectra were averaged both within and across subjects. The subjects were groups of 7 male and 12 female singing students, as well as a group of 16 untrained female voices. For each individual and also for each group, pairs of VRP recordings were made, with stringent separation of the modal/chest and falsetto/head registers. Maps are presented of eight scalar metrics, each of which was chosen to quantify a particular feature of the voice spectrum, over fundamental frequency and SPL. Metrics 1 and 2 chart the role of the fundamental in relation to the rest of the spectrum. Metrics 3 and 4 are used to explore the role of resonances in relation to SPL. Metrics 5 and 6 address the distribution of high frequency energy, while metrics 7 and 8 seek to describe the distribution of energy at the low end of the voice spectrum.

    Several examples are observed of phenomena that are difficult to predict from linear source-filter theory, and of the voice source being less uniform over the voice range than is conventionally assumed. These include a high-frequency band-limiting at high SPL and an unexpected persistence of the second harmonic at low SPL. The two voice registers give rise to clearly different maps. Only a few effects of training were observed, in the low frequency end below 2 kHz. The results are of potential interest in voice analysis, voice synthesis and for new insights into the voice production mechanism.
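
A compact sketch of the fo × SPL binning that such voice range profile maps rest on, averaging a per-cycle metric in cells of one semitone by one decibel; the function and variable names are illustrative, not from the paper:

```python
import numpy as np
from collections import defaultdict

def voice_map(fo_hz, spl_db, metric):
    """Average a per-cycle metric in 1 semitone x 1 dB cells
    of the fo-SPL (voice range profile) plane."""
    cells = defaultdict(list)
    for f, s, m in zip(fo_hz, spl_db, metric):
        st = int(round(69 + 12 * np.log2(f / 440.0)))  # semitone (MIDI) number
        cells[(st, int(round(s)))].append(m)
    return {cell: float(np.mean(v)) for cell, v in cells.items()}
```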

  • 17.
    Patel, Rita
    et al.
    Indiana University.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Non-invasive evaluation of vibratory kinematics of phonation in children (2020). In: 12th International Conference on Voice Physiology and Biomechanics / [ed] Henrich Bernardoni, N.; Bailly, L., Grenoble, 2020, p. 28. Conference paper (Refereed)
    Abstract [en]

    Developmentally, children do not have the five well-defined vocal fold layers that adults have. Also, children are often less able to tolerate the invasiveness of endoscopy. Electroglottography (EGG) is not invasive, but little is known of how the EGGs of children differ from those of adults. The goal of this study was to quantify some differences in the shape of the EGG waveform between children and adults, while accounting for the sensitivity to variation in the independent variables fo and SPL. A novel mapping of EGG waveform parameters over the speech and voice ranges was employed.

  • 18.
    Rossing, T D
    et al.
    Sundberg, Johan
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Acoustic comparison of soprano solo and choir singing (1987). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 82, no 3, p. 830-836. Article in journal (Refereed)
    Abstract [en]

    Five soprano singers were recorded while singing similar texts in both choir and solo modes of performance. A comparison of long-term-average spectra of similar passages in both modes indicates that subjects used different tactics to achieve somewhat higher concentrations of energy in the 2- to 4-kHz range when singing in the solo mode. It is likely that this effect resulted, at least in part, from a slight change of the voice source from choir to solo singing. The subjects used slightly more vibrato when singing in the solo mode.

  • 19.
    Rossing, T D
    et al.
    Sundberg, Johan
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    Acoustic comparison of voice use in solo and choir singing (1986). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 79, no 6, p. 1975-1981. Article in journal (Refereed)
    Abstract [en]

    An experiment was carried out in which eight bass/baritone singers were recorded while singing in both choral and solo modes. Together with their own voice, they heard the sound of the rest of the choir and a piano accompaniment, respectively. The recordings were analyzed in several ways, including computation of long-time-average spectra for each passage, analysis of the sound levels in the frequency ranges corresponding to the fundamental and the "singer's formant," and a comparison of the sung levels with the levels heard by the singers. Matching pairs of vowels in the two modes were inverse filtered to determine the voice source spectra and formant frequencies for comparison. Differences in both phonation and articulation between the two modes were observed. Subjects generally sang with more power in the singer's formant region in the solo mode and with more power in the fundamental region in the choral mode. Most singers used a reduced frequency distance between the third and fifth formants for increasing the power in the singer's formant range, while the difference in the fundamental was mostly a voice source effect. In a choral singing mode, subjects usually adjusted their voice levels to the levels they heard from the other singers, whereas in a solo singing mode the level sung depended much less on the level of an accompaniment.

  • 20.
    Ternström, Sten
    KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics. KTH, Superseded Departments (pre-2005), Speech, Music and Hearing. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Choir acoustics: an overview of scientific research published to date (2003). In: International Journal of Research in Choral Singing, Vol. 1, no 1, p. 3-12. Article in journal (Refereed)
    Abstract [en]

    Choir acoustics is but one facet of choir-related research, yet it is one of the most tangible. Several aspects of sound can be measured objectively, and such results can be related to known properties of voices, rooms, ears and musical scores. What follows is essentially an update of the literature overview in my Ph.D. dissertation from 1989 of empirical investigations known to me that deal specifically with the acoustics of choirs, vocal groups, or choir singers. This compilation of sources is no doubt incomplete in certain respects; nevertheless, it will hopefully prove to be useful for researchers and others interested in choir acoustics.

    Full text (pdf): http://www.speech.kth.se/prod/publications/files/qpsr/2002/2002_43_1_001-008.pdf
  • 21.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Who is normal, and how can we know? (2020). In: 12th International Conference on Voice Physiology and Biomechanics / [ed] Henrich Bernardoni, N.; Bailly, L., 2020. Conference paper (Other academic)
    Abstract [en]

    The quest for objective physical criteria for normal or aberrant voice has so far been disappointing; on the whole, perceptual evaluations still have greater evidential value. One reason is that current measurement paradigms often result in a severe undersampling, undermining validity. All voice metrics exhibit considerable variation, not only across individuals, but also across the voice range. Making "voice maps" over fo × SPL reveals a surprising amount of variation, even in homogeneous populations of normophonic individuals; so normative averages are not to be found. However, under a stringent protocol, individuals do systematically reproduce their own voice maps, as shown by Pabon and others. So, the norm could be the patient herself, if recorded prior to the pathology or the intervention. By comparing voice maps, not to a population norm, but to earlier takes of the same person, specific and detailed conclusions can be drawn.

  • 22.
    Ternström, Sten
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Nordmark, Jan
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Intonation preferences for major thirds with non-beating ensemble sounds (1996). In: Proc. of Nordic Acoustical Meeting: NAM'96, Helsinki, 1996, p. 359-365, article id F2. Conference paper (Refereed)
    Abstract [en]

    The frequency ratios, or intervals, of the twelve-tone scale can be mathematically defined in several slightly different ways, each of which may be more or less appropriate in different musical contexts. For maximum mobility in musical key, instruments of our time with fixed tuning are typically tuned in equal temperament, except for performances of early music or avant-garde contemporary music. Some contend that pure intonation, being free of beats, is more natural, and would be preferred in instruments with variable tuning. The sound of choirs is such that beats are very unlikely to serve as cues for intonation. Choral performers have access to variable tuning, yet have not been shown to prefer pure intonation. The difference between alternative intonation schemes is largest for the major third interval. Choral directors and other musically expert subjects were asked to adjust to their preference the intonation of 20 major third intervals in synthetic ensemble sounds. The preferred size of the major third was 395.4 cents, with intra-subject averages ranging from 388 to 407 cents.
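
For context (standard interval arithmetic, not from the paper): interval size in cents is 1200·log2(f2/f1), so the equal-tempered major third is exactly 400 cents and the pure 5:4 third about 386.3 cents; the preferred 395.4 cents thus falls between the two schemes. A quick check:

```python
import numpy as np

cents = lambda ratio: 1200 * np.log2(ratio)
print(cents(5 / 4))          # pure major third: ~386.31 cents
print(cents(2 ** (4 / 12)))  # equal-tempered major third: 400.0 cents
```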

  • 23.
    Ternström, Sten
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Pabon, Peter
    Royal Conservatoire, The Hague, NL.
    Voice Maps as a Tool for Understanding and Dealing with Variability in the Voice (2022). In: Applied Sciences, E-ISSN 2076-3417, Vol. 12, no 22, article id 11353. Article in journal (Refereed)
    Abstract [en]

    Individual acoustic and other physical metrics of vocal status have long struggled to prove their worth as clinical evidence. While combinations of metrics or “features” are now being intensely explored using data analytics methods, there is a risk that explainability and insight will suffer. The voice mapping paradigm discards the temporal dimension of vocal productions and uses fundamental frequency (fo) and sound pressure level (SPL) as independent control variables to implement a dense grid of measurement points over a relevant voice range. Such mapping visualizes how most physical voice metrics are greatly affected by fo and SPL, and more so individually than has been generally recognized. It is demonstrated that if fo and SPL are not controlled for during task elicitation, repeated measurements will generate “elicitation noise”, which can easily be large enough to obscure the effect of an intervention. It is observed that, although a given metric’s dependencies on fo and SPL often are complex and/or non-linear, they tend to be systematic and reproducible in any given individual. Once such personal trends are accounted for, ordinary voice metrics can be used to assess vocal status. The momentary value of any given metric needs to be interpreted in the context of the individual’s voice range, and voice mapping makes this possible. Examples are given of how voice mapping can be used to quantify voice variability, to eliminate elicitation noise, to improve the reproducibility and representativeness of already established metrics of the voice, and to assess reliably even subtle effects of interventions. Understanding variability at this level of detail will shed more light on the interdependent mechanisms of voice production, and facilitate progress toward more reliable objective assessments of voices across therapy or training.
