1 - 50 of 50
  • 1.
    Agelfors, Eva
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Beskow, Jonas
    Dahlquist, M
    Granström, Björn
    Lundeberg, M
    Salvi, Giampiero
    Spens, K-E
    Öhman, Tobias
    Two methods for Visual Parameter Extraction in the Teleface Project, 1999. In: Proceedings of Fonetik, Gothenburg, Sweden, 1999. Conference paper (Other academic)
  • 2. Ambrazaitis, G.
    et al.
    House, David
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH. KTH, Superseded Departments (pre-2005), Speech, Music and Hearing. KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics.
    Multimodal prominences: Exploring the patterning and usage of focal pitch accents, head beats and eyebrow beats in Swedish television news readings, 2017. In: Speech Communication, ISSN 0167-6393, E-ISSN 1872-7182, Vol. 95, p. 100-113. Article in journal (Refereed)
    Abstract [en]

    Facial beat gestures align with pitch accents in speech, functioning as visual prominence markers. However, it is not yet well understood whether and how gestures and pitch accents might be combined to create different types of multimodal prominence, and how specifically visual prominence cues are used in spoken communication. In this study, we explore the use and possible interaction of eyebrow (EB) and head (HB) beats with so-called focal pitch accents (FA) in a corpus of 31 brief news readings from Swedish television (four news anchors, 986 words in total), focusing on effects of position in text, information structure as well as speaker expressivity. Results reveal an inventory of four primary (combinations of) prominence markers in the corpus: FA+HB+EB, FA+HB, FA only (i.e., no gesture), and HB only, implying that eyebrow beats tend to occur only in combination with the other two markers. In addition, head beats occur significantly more frequently in the second than in the first part of a news reading. A functional analysis of the data suggests that the distribution of head beats might to some degree be governed by information structure, as the text-initial clause often defines a common ground or presents the theme of the news story. In the rheme part of the news story, FA, HB, and FA+HB are all common prominence markers. The choice between them is subject to variation which we suggest might represent a degree of freedom for the speaker to use the markers expressively. A second main observation concerns eyebrow beats, which seem to be used mainly as a kind of intensification marker for highlighting not only contrast, but also value, magnitude, or emotionally loaded words; it is applicable in any position in a text. We thus observe largely different patterns of occurrence and usage of head beats on the one hand and eyebrow beats on the other, suggesting that the two represent two separate modalities of visual prominence cuing.

  • 3.
    Askenfelt, Anders
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Vital statistics (Rating the quality of bows for string instruments), 2002. In: Strad, ISSN 0039-2049, Vol. 113, no 1348, p. 822-+. Article in journal (Refereed)
  • 4.
    Askenfelt, Anders
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Galembo, A. S.
    Study of the spectral inharmonicity of musical sound by the algorithms of pitch extraction, 2000. In: Acoustical Physics, ISSN 1063-7710, E-ISSN 1562-6865, Vol. 46, no 2, p. 121-132. Article in journal (Refereed)
    Abstract [en]

    The algorithms of pitch extraction are widely used in the studies of signals and, specifically, speech signals for the determination of the fundamental frequency. From the previous studies performed by Galembo and the calculations and experiments described in this paper, it follows that these methods can be adapted for the analysis and evaluation of the factors which form the sound property called pitch strength, pitch salience, or intonation clarity. Although this property plays an important role in music, it is quite poorly investigated. One of the aforementioned factors is represented by the distributed spectral inharmonicity which is typical of sounds produced, e.g., by strings. This paper presents a method of visualization, evaluation, and measurement of the inharmonicity of the spectrum of a musical sound with the help of the well-known algorithms of pitch extraction, namely the cepstrum and the harmonic product spectrum.
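
    The harmonic product spectrum named in the abstract is a classical pitch-extraction algorithm. The sketch below is a generic, hypothetical illustration of that algorithm, not the authors' implementation; the function name, parameters, and the use of NumPy are assumptions for illustration only.

```python
# Hypothetical sketch of the harmonic product spectrum (HPS) pitch estimator,
# one of the two classical algorithms named in the abstract. Not the authors' code.
import numpy as np

def harmonic_product_spectrum(x, fs, n_harmonics=5):
    """Estimate the fundamental frequency (Hz) of signal x sampled at rate fs."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    hps = spectrum.copy()
    for h in range(2, n_harmonics + 1):
        # Decimate the spectrum by h and multiply: partials of the true f0
        # line up across decimations and reinforce the product.
        decimated = spectrum[::h]
        hps[:len(decimated)] *= decimated
    # Search only the region to which all decimated copies contributed.
    search_limit = len(spectrum) // n_harmonics
    peak_bin = 1 + np.argmax(hps[1:search_limit])
    return peak_bin * fs / len(x)
```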

  • 5. Beaugendre, F.
    et al.
    House, David
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Hermes, D. J.
    Accentuation boundaries in Dutch, French and Swedish, 2001. In: Speech Communication, ISSN 0167-6393, E-ISSN 1872-7182, Vol. 33, no 4, p. 305-318. Article in journal (Refereed)
    Abstract [en]

    This paper presents a comparative study investigating the relation between the timing of a rising or falling pitch movement and the temporal structure of the syllable it accentuates for three languages: Dutch, French and Swedish. In a perception experiment, the five-syllable utterances /mamamamama/ and /?a?a?a?a?a/ were provided with a relatively fast rising or falling pitch movement. The timing of the movement was systematically varied so that it accented the third or the fourth syllable. Subjects were asked to indicate which syllable they perceived as accented. The accentuation boundary (AB) between the third and the fourth syllable was then defined as the moment before which more than half of the subjects indicated the third syllable as accented and after which more than half of the subjects indicated the fourth syllable. The results show that there are significant differences between the three languages as to the location of the AB. In general, for the rises, well-defined ABs were found. They were located in the middle of the vowel of the third syllable for French subjects, and later in that vowel for Dutch and Swedish subjects. For the falls, a clear AB was obtained only for the Dutch and the Swedish listeners. This was located at the end of the third syllable. For the French listeners, the fall did not yield a clear AB. This corroborates the absence of accentuation by means of falls in French. By varying the duration of the pitch movement it could be shown that, in all cases in which a clear AB was found, the cue for accentuation was located at the beginning of the pitch movement.

  • 6.
    Bell, Linda
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Linguistic adaptations in spoken and multimodal dialogue systems, 2000. Licentiate thesis, comprehensive summary (Other scientific)
  • 7.
    Bell, Linda
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Linguistic Adaptations in Spoken Human-Computer Dialogues - Empirical Studies of User Behavior, 2003. Doctoral thesis, monograph (Other scientific)
    Abstract [en]

    This thesis addresses the question of how speakers adapt their language when they interact with a spoken dialogue system. In human–human dialogue, people continuously adapt to their conversational partners at different levels. When interacting with computers, speakers also to some extent adapt their language to meet (what they believe to be) the constraints of the dialogue system. Furthermore, if a problem occurs in the human–computer dialogue, patterns of linguistic adaptation are often accentuated.

    In this thesis, we used an empirical approach in which a series of corpora of human–computer interaction were collected and analyzed. The systems used for data collection included both fully functional stand-alone systems in public settings, and simulated systems in controlled laboratory environments. All of the systems featured animated talking agents, and encouraged users to interact using unrestricted spontaneous language. Linguistic adaptation in the corpora was examined at the phonetic, prosodic, lexical, syntactic and pragmatic levels.

    Knowledge about users' linguistic adaptations can be useful in the development of spoken dialogue systems. If we are able to adequately describe their patterns of occurrence (at the different linguistic levels at which they occur), we will be able to build more precise user models, thus improving system performance. Our knowledge of linguistic adaptations can be useful in at least two ways: first, it has been shown that linguistic adaptations can be used to identify (and subsequently repair) errors in human–computer dialogue. Second, we can try to subtly influence users to behave in a certain way, for instance by implicitly encouraging a speaking style that improves speech recognition performance.

  • 8. Botinis, A.
    et al.
    Granström, Björn
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Mobius, B.
    Developments and paradigms in intonation research, 2001. In: Speech Communication, ISSN 0167-6393, E-ISSN 1872-7182, Vol. 33, no 4, p. 263-296. Article, review/survey (Refereed)
    Abstract [en]

    The present tutorial paper is addressed to a wide audience with different discipline backgrounds as well as variable expertise on intonation. The paper is structured into five sections. In Section 1, Introduction, basic concepts of intonation and prosody are summarised and cornerstones of intonation research are highlighted. In Section 2, Functions and forms of intonation, a wide range of functions from morpholexical and phrase levels to discourse and dialogue levels are discussed and forms of intonation with examples from different languages are presented. In Section 3, Modelling and labelling of intonation, established models of intonation as well as labelling systems are presented. In Section 4, Applications of intonation, the most widespread applications of intonation and especially technological ones are presented and methodological issues are discussed. In Section 5, Research perspective, research avenues and ultimate goals as well as the significance and benefits of intonation research in the upcoming years are outlined.

  • 9.
    Bresin, Roberto
    et al.
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing. KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics.
    Friberg, Anders
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    A multimedia environment for interactive music performance, 1997. In: Proceedings of KANSEI - The Technology of Emotion, AIMI International Workshop, 1997, p. 64-67. Conference paper (Refereed)
  • 10.
    Dahl, Sofia
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Striking movements: movement strategies and expression in percussive playing, 2003. Licentiate thesis, comprehensive summary (Other scientific)
    Abstract [en]

    This thesis concerns two aspects of movement and performance in percussion playing: first, the playing of an accent, a simple but much used and practised element in drumming, and second, the perception and communication of specific emotional intentions through movements during performances on marimba.

    Papers I and II investigated the execution and interpretation of an accent performed for different playing conditions. Players' movements, striking velocities and timing patterns were studied for different tempi, dynamic levels and striking surfaces. It was found that the players used differing movement strategies when playing and that they interpreted the accent differently, something that was reflected in their movement trajectories. Strokes at greater dynamic levels were played from a greater height and with higher striking velocities. All players initiated the accented strokes from a greater height, and delivered the accent with increased striking velocity compared to unaccented strokes. The interval beginning with the accented stroke was also prolonged, generally by delaying the following stroke. Recurrent cyclic patterns were found in the players' timing performances. In a listening test listeners perceived the strokes grouped according to the cyclic patterns.

    Paper III studied how emotional intent was conveyed to observers through the movements of a marimba player. A percussionist was filmed when playing a piece with the expressive intentions Happiness, Sadness, Anger and Fear on marimba. Observers rated the emotional content and movement cues in the video clips shown without sound. Results showed that the observers were able to identify the intentions Sadness, Anger, and Happiness, but not Fear. The rated movement cues showed that an Angry performance was characterized by large, fast, uneven and jerky movements, Happy performances by large, somewhat fast movements, and Sad performances by small, slow, even, and smooth movements.

    Keywords: drumming, percussion, movement strategies, instrument interaction, timing, accent, movement cues, emotional expression, gesture.

  • 11.
    Friberg, Anders
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Colombo, V.
    Fryden, L.
    Sundberg, J.
    Generating musical performances with Director Musices, 2000. In: Computer music journal, ISSN 0148-9267, E-ISSN 1531-5169, Vol. 24, no 3, p. 23-29. Article in journal (Refereed)
  • 12.
    Friberg, Anders
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Sundberg, J.
    Fryden, L.
    Music from motion: Sound level envelopes of tones expressing human locomotion, 2000. In: Journal of New Music Research, ISSN 0929-8215, E-ISSN 1744-5027, Vol. 29, no 3, p. 199-210. Article in journal (Refereed)
    Abstract [en]

    The common association of music with motion was investigated in a direct way. Could the original motion quality of different gaits be transferred to music and be perceived by a listener? Measurements of the ground reaction force by the foot during different gaits were transferred to sound by using the vertical force curve as sound level envelopes for tones played at different tempi. Three listening experiments assessed the motion quality of the resulting stimuli. In the first experiment, where the listeners were asked to freely describe the tones, 25% of answers were direct references to motion; such answers were more frequent at faster tempi. In the second experiment, where the listeners were asked to describe the motion quality, about half of the answers directly related to motion could be classified as belonging to one of the categories of dancing, jumping, running, walking, or stumbling. Most gait patterns were clearly classified as belonging to one of these categories, independent of presentation tempo. In the third experiment, the listeners were asked to rate the stimuli on 24 adjective scales. A factor analysis yielded four factors that could be interpreted as Swift vs. Solemn (factor 1), Graceful vs. Stamping (factor 2), Limping vs. Forceful (factor 3), and Springy (factor 4, no contrasting adjective). The results from the three experiments were consistent and indicated that each tone (corresponding to a particular gait) could clearly be categorised in terms of motion.

  • 13.
    Friberg, Anders
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Sundström, A.
    Swing ratios and ensemble timing in jazz performance: Evidence for a common rhythmic pattern, 2002. In: Music perception, ISSN 0730-7829, E-ISSN 1533-8312, Vol. 19, no 3, p. 333-349. Article in journal (Refereed)
    Abstract [en]

    The timing in jazz ensemble performances was investigated in order to approach the question of what makes the music swing. One well-known aspect of swing is that consecutive eighth notes are performed as long-short patterns. The exact duration ratio (the swing ratio) of the long-short pattern has been largely unknown. In this study, the swing ratio produced by drummers on the ride cymbal was measured. Three well-known jazz recordings and a play-along record were used. A substantial and gradual variation of the drummers' swing ratio with respect to tempo was observed. At slow tempi, the swing ratio was as high as 3.5: 1, whereas at fast tempi it reached 1:1. The often-mentioned triple-feel, that is, a ratio of 2:1, was present only at a certain tempo. The absolute duration of the short note in the long-short pattern was constant at about 100 ms for medium to fast tempi, suggesting a practical limit on tone duration that may be due to perceptual factors. Another aspect of swing is the soloist's timing in relation to the accompaniment. For example, a soloist can be characterized as playing behind the beat. In the second part, the swing ratio of the soloist and its relation to the cymbal accompaniment was measured from the same recordings. In slow tempi, the soloists were mostly playing their downbeats after the cymbal but were synchronized with the cymbal at the off-beats. This implied that the swing ratio of the soloist was considerably smaller than the cymbal accompaniment in slow tempi. It may give an impression of playing behind but at the same time keep the synchrony with the accompaniment at the off-beat positions. Finally, the possibilities of using computer tools in jazz pedagogy are discussed.
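
    The swing ratio discussed above is simply the duration ratio of the long to the short eighth note within each beat. The sketch below is a hypothetical illustration of that calculation from a list of onset times; it is not taken from the study, and the function name and example values are invented.

```python
# Hypothetical sketch: swing ratios (long/short eighth-note duration ratios)
# computed from ride-cymbal onset times in seconds. Not code from the study.
def swing_ratios(onsets):
    durations = [b - a for a, b in zip(onsets, onsets[1:])]
    # Pair consecutive durations as (long, short) within each beat.
    pairs = zip(durations[0::2], durations[1::2])
    return [long_ / short for long_, short in pairs if short > 0]

# Example with invented onsets at a slow tempo: ratios of roughly 3.5:1,
# the upper end of the range reported in the abstract.
print(swing_ratios([0.0, 0.35, 0.45, 0.80, 0.90]))  # approx. [3.5, 3.5]
```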

  • 14.
    Fuks, Leonardo
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    From air to music, 1998. Doctoral thesis, comprehensive summary (Other scientific)
  • 15. Galembo, A.
    et al.
    Askenfelt, Anders
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Cuddy, L. L.
    Russo, F. A.
    Effects of relative phases on pitch and timbre in the piano bass range, 2001. In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 110, no 3, p. 1649-1666. Article in journal (Refereed)
    Abstract [en]

    Piano bass tones raise questions related to the perception of multicomponent, inharmonic tones. In this study, the influence of the relative phases among partials on pitch and timbre was investigated for synthesized bass tones with piano-like inharmonicity. Three sets of bass tones (A0 = 27.5 Hz, 100 partials, flat spectral envelope) were generated: harmonic, low inharmonic, and high inharmonic. For each set, five starting phase relations among partials were applied: sine phases, alternate (sine/cosine) phases, random phases, Schroeder phases, and negative Schroeder phases. The pitch and timbre of the tones were influenced markedly by the starting phases. Listening tests showed that listeners are able to discriminate between tones having different starting phase relations, and also that the pitch could be changed by manipulating the relative phases (octave, fifth, major third). A piano-like inharmonicity gives a characteristic randomizing effect of the phase relations over time in tones starting with nonrandom phase relations. A measure of the regularity of the phase differences between adjacent partials is suggested for quantifying this randomization process. The observed phase effects might be of importance in synthesizing, recording, and reproducing piano music.

  • 16. Galembo, A.
    et al.
    Askenfelt, Anders
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Cuddy, L. L.
    Russo, F. A.
    Perceptual relevance of inharmonicity and spectral envelope in the piano bass range, 2004. In: Acta Acustica united with Acustica, ISSN 1610-1928, E-ISSN 1861-9959, Vol. 90, no 3, p. 528-536. Article in journal (Refereed)
    Abstract [en]

    Professionals consider the differences in the timbre of bass tones between large grand pianos and small uprights as significant. By tradition this difference has been attributed mainly to lower inharmonicity in grand pianos, due to longer bass strings. In this study, the importance of the spectral envelope, representing the dynamic balance between high-frequency and low-frequency energy in the spectrum, is contrasted against the importance of the level of inharmonicity. Results from two listening tests indicate that the inharmonicity is less important than the spectrum bandwidth in determining the timbre of piano bass tones.

  • 17.
    Gobl, Christer
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    The Voice Source in Speech Communication - Production and Perception Experiments Involving Inverse Filtering and Synthesis, 2003. Doctoral thesis, comprehensive summary (Other scientific)
    Abstract [en]

    This thesis explores, through a number of production and perception studies, the nature of the voice source signal and how it varies in spoken communication. Research is also presented that deals with the techniques and methodologies for analysing and synthesising the voice source. The main analytic technique involves interactive inverse filtering for obtaining the source signal, which is then parameterised to permit the quantification of source characteristics. The parameterisation is carried out by means of model matching, using the four-parameter LF model of differentiated glottal flow.

    The first three analytic studies focus on segmental and suprasegmental determinants of source variation. As part of the prosodic variation of utterances, focal stress shows for the glottal excitation an enhancement between the stressed vowel and the surrounding consonants. At a segmental level, the voice source characteristics of a vowel show potentially major differences as a function of the voiced/voiceless nature of an adjacent stop. Cross-language differences in the extent and directionality of the observed effects suggest different underlying control strategies in terms of the timing of the laryngeal and supralaryngeal gestures, as well as in the laryngeal tension settings. Different classes of voiced consonants also show differences in source characteristics: here the differences are likely to be passive consequences of the aerodynamic conditions that are inherent to the consonants. Two further analytic studies present voice source correlates for six different voice qualities as defined by Laver's classification system. Data from stressed and unstressed contexts clearly show that the transformation from one voice quality to another does not simply involve global changes of the source parameters. As well as providing insights into these aspects of speech production, the analytic studies provide quantitative measures useful in technology applications, particularly in speech synthesis.

    The perceptual experiments use the LF source implementation in the KLSYN88 synthesiser to test some of the analytic results and to harness them to explore the paralinguistic dimension of speech communication. A study of the perceptual salience of different parameters associated with breathy voice indicates that the source spectral slope is critically important and that, surprisingly, aspiration noise contributes relatively little. Further perceptual tests using stimuli with different voice qualities explore the mapping between voice quality and its paralinguistic function of expressing emotion, mood and attitude. The results of these studies highlight the crucial role of voice quality in expressing affect as well as providing pointers to how it combines with f0 for this purpose.

    The last section of the thesis focuses on the techniques used for the analysis and synthesis of the source. A semi-automatic method for inverse filtering is presented, which is novel in that it optimises the inverse filter by exploiting the knowledge that is typically used by the experimenter when carrying out manual interactive inverse filtering. A further study looks at the properties of the modified LF model in the KLSYN88 synthesiser: it highlights how it differs from the standard LF model and discusses the implications for synthesising the glottal source signal from LF model data. Effective and robust source parameterisation for the analysis of voice quality is the topic of the final paper: the effectiveness of global, amplitude-based, source parameters is examined across speech tokens with large differences in f0. Additional amplitude-based parameters are proposed to enable a more detailed characterisation of the glottal pulse.

    Keywords: Voice source dynamics, glottal source parameters, source-filter interaction, voice quality, phonation, perception, affect, emotion, mood, attitude, paralinguistic, inverse filtering, knowledge-based, formant synthesis, LF model, fundamental frequency, f0.

  • 18. Goebl, W.
    et al.
    Bresin, Roberto
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Measurement and reproduction accuracy of computer-controlled grand pianos, 2003. In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 114, no 4, p. 2273-2283. Article in journal (Refereed)
    Abstract [en]

    The recording and reproducing capabilities of a Yamaha Disklavier grand piano and a Bosendorfer SE290 computer-controlled grand piano were tested, with the goal of examining their reliability for performance research. An experimental setup consisting of accelerometers and a calibrated microphone was used to capture key and hammer movements, as well as the acoustic signal. Five selected keys were played by pianists with two types of touch (staccato and legato). Timing and dynamic differences between the original performance, the corresponding MIDI file recorded by the computer-controlled pianos, and its reproduction were analyzed. The two devices performed quite differently with respect to timing and dynamic accuracy. The Disklavier's onset capturing was slightly more precise (+/-10 ms) than its reproduction (-20 to +30 ms); the Bosendorfer performed generally better, but its timing accuracy was slightly less precise for recording (-10 to 3 ms) than for reproduction (+/-2 ms). Both devices exhibited a systematic (linear) error in recording over time. In the dynamic dimension, the Bosendorfer showed higher consistency over the whole dynamic range, while the Disklavier performed well only in a wide middle range. Neither device was able to capture or reproduce different types of touch.

  • 19.
    Granqvist, Svante
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Hertegård, S.
    Larsson, H.
    Sundberg, J.
    Simultaneous analysis of vocal fold vibration and transglottal airflow: Exploring a new experimental setup, 2003. In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 17, no 3, p. 319-330. Article in journal (Refereed)
    Abstract [en]

    The purpose of this study was to develop an analysis system for studying the relationship between vocal fold vibration and the associated transglottal airflow. Recordings of airflow, electroglottography (EGG), oral air pressure, and acoustic signals were performed simultaneously with high-speed imaging at a rate of approximately 1900 frames/s. Inverse filtered airflow is compared with the simultaneous glottal area extracted from the high-speed image sequence. The accuracy of the synchronization between the camera images and the foot pedal synchronization pulse was examined, showing that potential synchronization errors increase with time distance to the synchronization pulse. Therefore, analysis was limited to material near the synchronization pulse. Results corroborate previous predictions that airflow lags behind area, but also they reveal that relationships between these two entities may be complex and apparently varying with phonation mode.

  • 20.
    Granqvist, Svante
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Sundberg, Johan
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Wetzberg, J. E.
    Lundberg, J.
    Acoustic modeling of NO gas evacuation from the maxillar sinuses, 2004. Conference paper (Refereed)
  • 21.
    Granström, Björn
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Towards a virtual language tutor, 2004. In: Proc InSTIL/ICALL2004 NLP and Speech Technologies in Advanced Language Learning / [ed] Delmonte, R.; Delcloque, P.; Tonellli, S., Venice, Italy, 2004, p. 1-8. Conference paper (Other academic)
    Abstract [en]

    In this paper we present some work aiming at creating a virtual language tutor. The ambition is to create a tutor that can be engaged in many aspects of language learning, from detailed pronunciation training to conversational practice. Some of the crucial components of such a system are described. An initial implementation of a stress/quantity training tutor for Swedish will be presented.

  • 22.
    Guettler, Knut
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Development of Helmholtz motion and related wave patterns in the bowed string, 2001. Licentiate thesis, comprehensive summary (Other scientific)
    Abstract [en]

    Of the many wave patterns that the bowed string is capable of producing, the so-called ‘Helmholtz motion’ (Helmholtz 1862) gives the fullest sound in terms of power and overtone richness. Papers one and two of this thesis deal with the creation of this particular string movement: The first paper, based on computer analysis, describes some system parameters’ influence on the transient in terms of ‘playability’. The second paper deals with the perception of real violin attacks of different transient qualities. Not surprisingly, it can be shown that tone onsets are considered superior when the attack noise has a very limited duration. However, the character of the noise plays an important part too, as the listener’s tolerance of noise in terms of duration is almost twice as great for ‘slipping noise’ as for ‘creaks’ or ‘raucousness’ during the tone onsets. The third paper describes the triggering mechanics of a peculiar tone production referred to as ‘Anomalous Low Frequencies’ (ALF). If properly skilled, a player can achieve pitches below the normal range of the instrument. This phenomenon, analysed and explained through use of computer simulations, is related to triggering waves taking ‘an extra turn’ on the string before causing the string’s release from the bow-hair grip. Since both transverse and torsional propagation speeds are involved, two different sets of ‘sub-ranged’ notes can be produced this way.

  • 23.
    Guettler, Knut
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    The bowed string, 2002. Doctoral thesis, comprehensive summary (Other scientific)
    Abstract [en]

    Of the many waveforms the bowed string can assume, the so-called "Helmholtz motion" (Helmholtz 1862) gives the fullest sound in terms of power and overtone richness. The development of this steady-state oscillation pattern can take many different paths, most of which would include noise caused by stick-slip irregularities of the bow-string contact. Of the five papers included in the thesis, the first one shows, not surprisingly, that tone onsets are considered superior when the attack noise has a very limited duration. It was found, however, that in this judgment the character of the noise plays an important part, as the listener’s tolerance of noise in terms of duration is almost twice as great for "slipping noise" as for "creaks" or "raucousness" during the tone onsets. The three following papers contain analyses focusing on how irregular slip-stick triggering may be avoided, as is quite often the case in practical playing by professionals. The fifth paper describes the triggering mechanism of a peculiar tone production referred to as "Anomalous Low Frequencies" (ALF). If properly skilled, a player can achieve pitches below the normal range of the instrument. This phenomenon is related to triggering waves taking "an extra turn" on the string before causing the string’s release from the bow-hair grip. Since transverse and torsional propagation speeds are both involved, two different sets of "sub-ranged" notes can be produced this way. In the four last papers wave patterns are analysed and explained through the use of computer simulations.

    Key words: Bowed string, violin, musical acoustics, musical transient, anomalous low frequencies, Helmholtz motion

  • 24. Hertegård, S.
    et al.
    Granqvist, Svante
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Lindestad, P. A.
    Botulinum toxin injections for essential voice tremor, 2000. In: Annals of Otology, Rhinology and Laryngology, ISSN 0003-4894, E-ISSN 1943-572X, Vol. 109, no 2, p. 204-209. Article in journal (Refereed)
    Abstract [en]

    Fifteen patients, 13 women and 2 men, with a mean age of 72.7 years (56 to 86 years) and a clinical diagnosis of essential voice tremor, were treated with botulinum injections to the thyroarytenoid muscles, and in some cases, to the cricothyroid or thyrohyoid muscles. Evaluations were based on subjective judgments by the patients, and on perceptual and acoustic analysis of voice recordings. Subjective evaluations indicated that the treatment had a beneficial effect in 678 of the patients. Perceptual evaluations showed a significant decrease in voice tremor during connected speech (p < .05). Acoustic analysis showed a nearly significant decrease in the fundamental frequency variations (p = .06) and a significant decrease in fundamental frequency during sustained vowel phonation (p < .01). The results of perceptual evaluation coincided most closely with the subjective judgments. It was concluded that the treatment was successful in 50% to 65% of the patients, depending on the method of evaluation.

  • 25. Hiraga, Rumi
    et al.
    Bresin, Roberto
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Hirata, Keiji
    Katayose, Haruhiro
    Rencon 2004: Turing Test for Musical Expression, 2004. In: Proceedings of the 4th international conference on New interfaces for musical expression / [ed] Lyons, Michael J., Hamamatsu, Shizuoka, Japan: National University of Singapore, 2004, p. 120-123. Conference paper (Refereed)
    Abstract [en]

    Rencon is an annual international event that started in 2002. It has roles of (1) pursuing evaluation methods for systems whose output includes subjective issues, and (2) providing a forum for researchers of several fields related to musical expression. In the past, Rencon was held as a workshop associated with a musical contest that provided a forum for presenting and discussing the latest research in automatic performance rendering. This year we introduce new evaluation methods of performance expression to Rencon: a Turing Test and a Gnirut Test, which is a reverse Turing Test, for performance expression. We have opened a section of the contests to any instruments and genre of music, including synthesized human voices.

  • 26.
    Lindberg, Johan
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Speech technology for secure transactions, 2000. Licentiate thesis, comprehensive summary (Other scientific)
  • 27.
    Lindberg, Nikolaj
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Data driven methods in natural language processing: two applications, 2000. Licentiate thesis, comprehensive summary (Other scientific)
  • 28. Lindestad, P. A.
    et al.
    Södersten, M.
    Merker, B.
    Granqvist, Svante
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Voice source characteristics in Mongolian throat singing studied with high-speed imaging technique, acoustic spectra, and inverse filtering, 2001. In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 15, no 1, p. 78-85. Article in journal (Refereed)
    Abstract [en]

    Mongolian throat singing can be performed in different modes. In Mongolia, the bass-type is called Kargyraa. The voice source in bass-type throat singing was studied in one male singer. The subject alternated between modal voice and the throat singing mode. Vocal fold vibrations were observed with high-speed photography, using a computerized recording system. The spectral characteristics of the sound signal were analyzed. Kymographic image data were compared to the sound signal and flow inverse filtering data from the same singer were obtained on a separate occasion. It was found that the vocal folds vibrated at the same frequency throughout both modes of singing. During throat singing the ventricular folds vibrated with complete but short closures at half the frequency of the true vocal folds, covering every second vocal fold closure. Kymographic data confirmed the findings. The spectrum contained added subharmonics compared to modal voice. In the inverse filtered signal the amplitude of every second airflow pulse was considerably lowered. The ventricular folds appeared to modulate the sound by reducing the glottal flow of every other vocal fold vibratory cycle.

  • 29.
    Megyesi, Beata
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Data-driven syntactic analysis, 2002. Doctoral thesis, monograph (Other scientific)
  • 30.
    Nilsonne, Åsa
    et al.
    Karolinska Institute.
    Sundberg, Johan
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Ternström, Sten
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Askenfelt, Anders
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Measuring the rate of change of voice fundamental frequency in fluent speech during mental depression, 1988. In: The Journal of the Acoustical Society of America, Vol. 83, no 2, p. 716-728. Article in journal (Refereed)
    Abstract [en]

    A method of measuring the rate of change of fundamental frequency has been developed in an effort to find acoustic voice parameters that could be useful in psychiatric research. A minicomputer program was used to extract seven parameters from the fundamental frequency contour of tape‐recorded speech samples: (1) the average rate of change of the fundamental frequency and (2) its standard deviation, (3) the absolute rate of fundamental frequency change, (4) the total reading time, (5) the percent pause time of the total reading time, (6) the mean, and (7) the standard deviation of the fundamental frequency distribution. The method is demonstrated on (a) a material consisting of synthetic speech and (b) voice recordings of depressed patients who were examined during depression and after improvement.
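
    As a rough illustration of the kind of parameters listed above, the sketch below computes a few F0-contour statistics from a frame-based F0 track. It is a hypothetical example, not the original minicomputer program; the function name, the None-for-pause convention, and the use of NumPy are assumptions.

```python
# Hypothetical sketch of a few F0-contour statistics like those in the abstract.
# Not the original minicomputer program. f0_track holds one F0 value (Hz) per
# frame, with None marking pauses; frame_dt is the frame step in seconds.
import numpy as np

def f0_statistics(f0_track, frame_dt):
    voiced = np.array([f for f in f0_track if f is not None], dtype=float)
    # Rate of change between consecutive voiced frames, in Hz per second.
    # (Simplification: frames on either side of a pause are treated as adjacent.)
    rates = np.diff(voiced) / frame_dt
    return {
        "mean_rate_of_change": rates.mean(),
        "sd_rate_of_change": rates.std(),
        "mean_abs_rate_of_change": np.abs(rates).mean(),
        "total_reading_time": len(f0_track) * frame_dt,
        "percent_pause": 100.0 * sum(f is None for f in f0_track) / len(f0_track),
        "mean_f0": voiced.mean(),
        "sd_f0": voiced.std(),
    }
```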

  • 31.
    Prame, Eric
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Vibrato and intonation in classical singing, 2000. Licentiate thesis, comprehensive summary (Other scientific)
  • 32. Rocchesso, D.
    et al.
    Bresin, Roberto
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Fernström, M.
    Sounding objects, 2003. In: IEEE Multimedia, ISSN 1070-986X, E-ISSN 1941-0166, Vol. 10, no 2, p. 42-52. Article in journal (Refereed)
    Abstract [en]

    Interactive systems, virtual environments, and information display applications need dynamic sound models rather than faithful audio reproductions. This implies three levels of research: auditory perception, physics-based sound modeling, and expressive parametric control. Parallel progress along these three lines leads to effective auditory displays that can complement or substitute visual displays.

  • 33. Rossing, T D
    et al.
    Sundberg, Johan
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Ternström, Sten
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Acoustic comparison of soprano solo and choir singing, 1987. In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 82, no 3, p. 830-836. Article in journal (Refereed)
    Abstract [en]

    Five soprano singers were recorded while singing similar texts in both choir and solo modes of performance. A comparison of long-term-average spectra of similar passages in both modes indicates that subjects used different tactics to achieve somewhat higher concentrations of energy in the 2- to 4-kHz range when singing in the solo mode. It is likely that this effect resulted, at least in part, from a slight change of the voice source from choir to solo singing. The subjects used slightly more vibrato when singing in the solo mode.

  • 34.
    Smeds, Karolina
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Cochlear hearing loss and evaluation of prescriptive methods for non-linear hearing aids, 2000. Licentiate thesis, comprehensive summary (Other scientific)
  • 35.
    Sohlström, Hans
    KTH, School of Electrical Engineering (EES), Microsystem Technology (Changed name 20121201). KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    KONTAKTMIKROFONER OCH STÖRNINGSREDUCERANDE MIKROFONER: Särskilt med tanke på automatisk taligenkänning, 1977. Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [sv]

    When speech recognition systems are to be used in practice, it is often in noisy environments. This study is an attempt to compare different types of noise-cancelling microphones, both theoretically and practically. The comparison concerns the sound quality in general and the microphones' suitability for automatic speech recognition in particular.

    Sound quality was examined partly through measurements in an anechoic room, partly through studies of the microphones' speech reproduction.

    The latter was done with the aid of various types of spectrograms of speech recorded through the different microphones.

    The microphones' suitability for automatic speech recognition was tested through recognition experiments, both with and without background noise.

    The microphone types compared are: a directional microphone, two close-talking microphones of pressure-gradient type, and a contact microphone. For the latter, different placements were tested. The contact microphone was given special attention since it represents an entirely different principle from the other microphones.

    Using a contact microphone distorts the speech, in a way that depends on the placement of the microphone. Two placements were used in the study: on the throat and on the forehead.

    Spectrograms of a selection of phonemes, recorded both via the contact microphone in these placements and via a normal reference microphone, are collected in the APPENDIX following page 133.

    The contact microphone placement on the throat proved to give a reproduction natural enough both for general use and for automatic speech recognition. Noise suppression was considerably better than with the conventional microphones. The conventional microphones suppressed low-frequency noise relatively poorly, which caused problems for speech recognition.

  • 36.
    Sohlström, Hans
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    NOISE CANCELLING MICROPHONES FOR AUTOMATIC SPEECH RECOGNITION, 1978. Chapter in book (Other academic)
    Abstract [en]

    Automatic speech recognition as well as man-to-man communications in noisy environments require noise cancelling microphones. A number of such microphones are studied. Special attention is given to a contact microphone. The test procedure is described and the results are discussed. The contact microphone is found to give better sound quality than expected.

  • 37.
    Ström, Nikko
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Automatic continuous speech recognition with rapid speaker adaptation for human/machine interaction, 1997. Doctoral thesis, comprehensive summary (Other scientific)
  • 38. Sundberg, J.
    et al.
    Friberg, Anders
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Bresin, Roberto
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Attempts to reproduce a pianist's expressive timing with Director Musices performance rules, 2003. In: Journal of New Music Research, ISSN 0929-8215, E-ISSN 1744-5027, Vol. 32, no 3, p. 317-325. Article in journal (Refereed)
    Abstract [en]

    The Director Musices generative grammar of music performance is a system of context dependent rules that automatically introduces expressive deviation in performances of input score files. A number of these rules concern timing. In this investigation the ability of such rules to reproduce a professional pianist's timing deviations from nominal note inter-onset-intervals is examined. Rules affecting tone inter-onset-intervals were first tested one by one for the various sections of the excerpt, and then in combinations. Results were evaluated in terms of the correlation between the deviations made by the pianist and by the rule system. It is found that rules reflecting the phrase structure produced high correlations in some sections. On the other hand, some rules failed to produce significant correlation with the pianist's deviations, and thus seemed irrelevant to the particular performance analysed. It is concluded that phrasing was a prominent principle in this performance and that rule combinations have to change between sections in order to match this pianist's deviations.

  • 39.
    Sundberg, Johan
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Ternström, Sten
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Perkins, William H
    University of Southern California.
    Gramming, Patricia
    Malmö General Hospital.
    Long-time average spectrum analysis of phonatory effects of noise and filtered auditory feedback, 1988. In: Journal of Phonetics, ISSN 0095-4470, E-ISSN 1095-8576, Vol. 16, p. 203-219. Article in journal (Refereed)
  • 40. Södersten, M.
    et al.
    Granqvist, Svante
    KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics.
    Hammarberg, B.
    Szabo, Annika
    Karolinska Institute, Sweden.
    Vocal behavior and vocal loading factors for preschool teachers at work studied with binaural DAT recordings, 2002. In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 16, no 3, p. 356-371. Article in journal (Refereed)
    Abstract [en]

    Preschool teachers are at risk for developing voice problems such as vocal fatigue and vocal nodules. The purpose of this report was to study preschool teachers' voice use during work. Ten healthy female preschool teachers working at daycare centers (DCC) served as subjects. A binaural recording technique was used. Two microphones were placed on both sides of the subject's head, at equal distance from the mouth, and a portable DAT recorder was attached to the subject's waist. Recordings were made of a standard reading passage before work (baseline) and of spontaneous speech during work. The recording technique allowed separate analyses of the level of the background noise, and of the subjects' voice sound pressure level, mean fundamental frequency, and total phonation time. Among the results, mean background noise level for the ten DCCs was 76.1 dBA (range 73.0-78.2), which is more than 20 dB higher than what is recommended where speech communication is important (50-55 dBA). The subjects spoke on an average of 9.1 dB louder (p < 0.0001), and with higher mean fundamental frequency (247 Hz) during work as compared to the baseline (202 Hz) (p < 0.0001). Mean phonation time for the group was 17%, which was considered high. It was concluded that preschool teachers do have a highly vocally demanding profession. Important steps to reduce the vocal loading for this occupation would be to decrease the background noise levels and include pauses so that preschool teachers can rest their voices.

  • 41.
    Ternström, Sten
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Hearing myself with others: sound levels in choral performance measured with separation of one's own voice from the rest of the choir, 1994. In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 8, no 4, p. 293-302. Article in journal (Refereed)
    Abstract [en]

    The choir singer has two acoustic signals to attend to: the sound of his or her own voice (feedback), and the sound of the rest of the choir (reference). The balance in loudness between feedback and reference is governed mainly by the room acoustics. Although earlier experiments have shown that singers have a fairly large tolerance for imbalance, with references ranging from -23 to +5 dB, experience suggests that, when singers are given control over this parameter, their preferences are much narrower. A quantification of the optimum balance would be useful in the design of concert stages and rehearsal halls. A method is described for measuring the feedback and reference levels as experienced by singers under live performance conditions. Recordings were made using binaural microphones worn by choir singer subjects. With the given combination of choir and room, it was possible to achieve adequate separation of the feedback and reference signals with simple signal processing. The feedback-to-reference ratio averaged over the 12 singers was found to be +3.9 dB, with extremes of +1.5 and +7.3 dB.

  • 42.
    Ternström, Sten
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Perceptual evaluations of voice scatter in unison choir sounds, 1993. In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 7, no 2, p. 129-135. Article in journal (Refereed)
    Abstract [en]

    The preferences of experienced listeners for pitch and formant frequency dispersion in unison choir sounds were explored using synthesized stimuli. Two types of dispersion were investigated: (a) pitch scatter, which arises when voices in an ensemble exhibit small differences in mean fundamental frequency, and (b) spectral smear, defined as such dispersion of formants 3 to 5 as arises from differences in vocal tract length. Each stimulus represented a choir section of five bass, tenor, alto, or soprano voices, producing the vowel [u], [a], or [ae]. Subjects chose one dispersion level out of six available, selecting the "maximum tolerable" in a first run and the "preferred" in a second run. The listeners were very different in their tolerance for dispersion. Typical scatter choices were 14 cent standard deviation for "tolerable" and 0 or 5 cent for "preferred." The smear choices were less consistent; the standard deviations were 12 and 7%, respectively. In all modes of assessment, the largest dispersion was chosen for the vowel [u] on a bass tone. There was a vowel effect on the smear choices. The effects of voice category were not significant.

  • 43.
    Ternström, Sten
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Physical and acoustic factors that interact with the singer to produce the choral sound, 1991. In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 5, no 2, p. 128-143. Article in journal (Refereed)
    Abstract [en]

    Most of the people who perform music do so in the capacity of choir singers. An understanding of the particular acoustic properties of the choral sound is of interest not only to performers, but also to educators, architectural acousticians, audio technicians, and composers. The goal of choir acoustics is to describe various aspects of choral sound in acoustic terms, thereby taking into account the acoustics of voice production, the acoustics of rooms, and psychoacoustic properties of the auditory system. This article is an overview of choir acoustics research done in Stockholm over the past 8 years. It is an abridged and adapted version of an overview given in the author’s dissertation, Acoustical Aspects of Choir Singing. Three different kinds of experiments were made: (a) the control of phonation frequency and the vowel articulation of choir singers were investigated, by having individual choir singers perform vocal tasks on demand or in response to auditory stimuli; (b) typical values of sound levels, phonation frequency scatter, and long-time averaged spectra were obtained by measurements on choir singers rehearsing in ensemble under normal or near-normal conditions; and (c) models for certain aspects of choral sound were formulated and evaluated by synthesis. The choir singer’s performance is based on two acoustic signals: her or his own voice (the feedback) and the rest of the choir (the reference). Intonation errors were found to be induced or increased by (a) large level differences between the feedback and the reference, (b) several perceptually unfavorable spectral properties of the reference, and (c) articulatory maneuvers, i.e., intrinsic pitch. The magnitude of the errors would be indirectly related to room acoustics (a and b) and to voice usage and musical/textual content (b and c). When singing alone, singers from one choir used a vowel articulation that was different from that in speech and also more unified; it was also in some respects different from solo singing. Long-time average spectrum effects of room acoustics and musical dynamics were large, as expected; those of choir type and musical material were smaller. To some extent, choirs adapted their sound level and voice usage to the room acoustics. Small random fluctuations in phonation frequency, called “flutter” and “wow,” are always present in human voices. With multiple voices, flutter and wow cause, through interference, a pseudorandom, independent amplitude modulation of partial tones, which is known to cue the perceptual “chorus effect.” The chorus effect is also influenced by the reverberation properties of the room. Choral sounds were explored by means of synthesis, and the importance of realistic flutter was established. Flutter in choir singers was analyzed and simulated in single synthesized voices. Expert listeners were unable to discriminate between simulated and authentic flutter.

  • 44.
    Ternström, Sten
    et al.
    KTH, Superseded Departments (pre-2005), Speech Transmission and Music Acoustics. KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    Friberg, Anders
    KTH, Superseded Departments (pre-2005), Speech, Music and Hearing.
    Analysis and simulation of small variations in the fundamental frequency of sustained vowels, 1989. In: STL-QPSR, Vol. 30, no 3, p. 001-014. Article in journal (Other academic)
  • 45.
    Ternström, Sten
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Sundberg, Johan
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Formant frequencies of choir singers, 1989. In: The Journal of the Acoustical Society of America, Vol. 86, no 2, p. 517-522. Article in journal (Refereed)
    Abstract [en]

    The four lowest formant frequencies were measured in eight members of the bass section of a good amateur choir under two conditions: (1) when reading the text of a poem aloud; and (2) when performing the same text as a song. Certain formant frequency differences were observed that were similar to those previously found between professional singers’ spoken and sung vowels. In singing, the intersubject scatter of the three lowest formant frequencies was smaller, and the fourth formant was lower.

  • 46.
    Ternström, Sten
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Sundberg, Johan
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Intonation precision of choir singers, 1988. In: The Journal of the Acoustical Society of America, Vol. 84, no 1, p. 59-69. Article in journal (Refereed)
  • 47.
    Ternström, Sten
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Sundberg, Johan
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Colldén, A
    Articulatory Fo perturbations and auditory feedback, 1988. In: Journal of speech and hearing research, ISSN 0022-4685, Vol. 31, no 2, p. 187-192. Article in journal (Refereed)
    Abstract [en]

    Singers are required to sing with a high degree of precision of fundamental frequency (Fo). Does this mean that they have learned to compensate for the change of pitch that has been described in speech during production of different vowels? Experienced choir singers sang sustained tones with a change of vowel in mid-tone. The fundamental frequency was measured, and the resulting Fo contours were evaluated with respect to Fo effects coincident with the vowel changes. The tasks were performed both with normal auditory feedback and with the auditory feedback masked by noise in headphones. The vowels (i) and (y) were found to be associated with higher Fo than other vowels. The irregularities in the Fo curves were somewhat larger in the absence of auditory feedback. This is consistent with findings during speech production. The instability in Fo, measured as the standard deviation over each tone, was also larger in the absence of feedback.

  • 48.
    Ternström, Sten
    et al.
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Södersten, M.
    Bohman, M.
    Cancellation of simulated environmental noise as a tool for measuring vocal performance during noise exposure, 2002. In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 16, no 2, p. 195-206. Article in journal (Refereed)
    Abstract [en]

    It can be difficult for the voice clinician to observe or measure how a patient uses his voice in a noisy environment. We consider here a novel method for obtaining this information in the laboratory. Worksite noise and filtered white noise were reproduced over high-fidelity loudspeakers. In this noise, 11 subjects read an instructional text of 1.5 to 2 minutes duration, as if addressing a group of people. Using channel estimation techniques, the site noise was suppressed from the recording, and the voice signal alone was recovered. The attainable noise rejection is limited only by the precision of the experimental setup, which includes the need for the subject to remain still so as not to perturb the estimated acoustic channel. This feasibility study, with 7 female and 4 male subjects, showed that small displacements of the speaker's body, even breathing, impose a practical limit on the attainable noise rejection. The noise rejection was typically 30 dB and maximally 40 dB down over the entire voice spectrum. Recordings thus processed were clean enough to permit voice analysis with the long-time average spectrum and the computerized phonetogram. The effects of site noise on voice sound pressure level, fundamental frequency, long-term average spectrum centroid, phonetogram area, and phonation time were much as expected, but with some interesting differences between females and males.

  • 49.
    Thomasson, Monica
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    From Air to Aria. Relevance of Respiratory Behaviour to Voice Function in Classical Western Vocal Art, 2003. Doctoral thesis, comprehensive summary (Other scientific)
    Abstract [en]

    While previous studies of opera singers’ respiratory behaviour have focused on kinematic or dynamic aspects mainly, this thesis attempts to adopt a broader perspective. Not only lung volumes, rib cage and abdominal wall kinematics are considered, but also the effects of lung volumes and respiratory behaviour on phonation characteristics. Also, we attempted to pay particular attention to musically related factors.

    Statistical data on opera singers’ initiation and termination lung volumes, breath group volumes and flow rates, all related to vital capacity, were gathered. Consistency of phonatory and inhalatory respiratory behaviour was estimated, as well as rib cage and abdominal wall influence on lung volume change. The singers were found to perform songs from their repertoire in a quasi-realistic concert situation. Further, the effect of lung volume on voice function was studied in non-singer subjects and professional male opera singers’ habitual behaviour and non-habitual inhalatory behaviour. Comparisons between high and low lung volume of vertical laryngeal position, subglottal pressure, and voice source characteristics were made. In addition, the effect of two polar inhalatory behaviours on the same voice function parameters was examined.

    When performing songs and arias from their repertoire, the professional opera singers used the full range of lung volumes, likely to affect all lung volume dependent mechanisms involved in respiratory control. Even though displaying different behaviours, they were highly consistent within their individual breathing patterns, especially so with regard to rib cage movement and lung volume change. Lung volume was found to affect voice function in non-singer subjects, such that the overall glottal adduction was smaller at high than at low lung volume. When using a non-habitual inhalatory behaviour, the singers’ voice function was qualitatively affected in a similar manner as that observed in non-singers. However, the singers’ habitual breathing behaviour seemed to include a strategy that either inhibits or minimises the influence of lung volume dependent mechanisms, such that these effects are reduced and presumably perceptually irrelevant. The polar inhalatory abdominal wall behaviours had no effect on voice function.

    Keywords: Abdominal wall, breathing, consistency, inhalation, lung volume, respiratory behaviour, phonation, rib cage, singers, singing, subglottal pressure, tracheal pull, vertical laryngeal position, voice source.

  • 50.
    Öhman, Tobias
    KTH, Superseded Departments, Speech Transmission and Music Acoustics.
    Vision in speech technology: automatic measurements of visual speech and audiovisual intelligibility of synthetic and natural faces, 2000. Licentiate thesis, comprehensive summary (Other scientific)