Publications (10 of 82)
Ambrazaitis, G. & House, D. (2017). Multimodal prominences: Exploring the patterning and usage of focal pitch accents, head beats and eyebrow beats in Swedish television news readings. Speech Communication, 95, 100-113
2017 (English). In: Speech Communication, ISSN 0167-6393, E-ISSN 1872-7182, Vol. 95, pp. 100-113. Journal article (Peer-reviewed). Published
Abstract [en]

Facial beat gestures align with pitch accents in speech, functioning as visual prominence markers. However, it is not yet well understood whether and how gestures and pitch accents might be combined to create different types of multimodal prominence, and how specifically visual prominence cues are used in spoken communication. In this study, we explore the use and possible interaction of eyebrow (EB) and head (HB) beats with so-called focal pitch accents (FA) in a corpus of 31 brief news readings from Swedish television (four news anchors, 986 words in total), focusing on effects of position in text, information structure as well as speaker expressivity. Results reveal an inventory of four primary (combinations of) prominence markers in the corpus: FA+HB+EB, FA+HB, FA only (i.e., no gesture), and HB only, implying that eyebrow beats tend to occur only in combination with the other two markers. In addition, head beats occur significantly more frequently in the second than in the first part of a news reading. A functional analysis of the data suggests that the distribution of head beats might to some degree be governed by information structure, as the text-initial clause often defines a common ground or presents the theme of the news story. In the rheme part of the news story, FA, HB, and FA+HB are all common prominence markers. The choice between them is subject to variation which we suggest might represent a degree of freedom for the speaker to use the markers expressively. A second main observation concerns eyebrow beats, which seem to be used mainly as a kind of intensification marker for highlighting not only contrast, but also value, magnitude, or emotionally loaded words; it is applicable in any position in a text. We thus observe largely different patterns of occurrence and usage of head beats on the one hand and eyebrow beats on the other, suggesting that the two represent two separate modalities of visual prominence cuing.
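
As a rough illustration of the tallying behind these distributional results, here is a minimal Python sketch (not the authors' code); the word-level annotation format and the first/second-part labels are assumptions for illustration:

from collections import Counter

def marker_label(has_fa, has_hb, has_eb):
    # Collapse the three binary prominence cues into one combined label,
    # e.g. "FA+HB+EB", "FA+HB", "FA", "HB", or "none".
    parts = [name for name, present in
             (("FA", has_fa), ("HB", has_hb), ("EB", has_eb)) if present]
    return "+".join(parts) if parts else "none"

def tally_markers(annotations):
    # Count combined markers separately for the first and second part
    # of a news reading, mirroring the position effect reported above.
    counts = {"first": Counter(), "second": Counter()}
    for word, has_fa, has_hb, has_eb, part in annotations:
        counts[part][marker_label(has_fa, has_hb, has_eb)] += 1
    return counts

example = [("regeringen", True, True, True, "first"),    # hypothetical word with FA+HB+EB
           ("beslutade", False, True, False, "second")]  # hypothetical word with HB only
print(tally_markers(example))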

Place, publisher, year, edition, pages
Elsevier B.V., 2017
Keywords
Degrees of freedom (mechanics), Common ground, Degree of freedom, Information structures, Multi-modal, Pitch accents, Swedish, Continuous speech recognition
HSV category
Identifiers
urn:nbn:se:kth:diva-227120 (URN); 10.1016/j.specom.2017.08.008 (DOI); 000418973700008 (ISI); 2-s2.0-85034707910 (Scopus ID)
Note

QC 20180508

Available from: 2018-05-08. Created: 2018-05-08. Last updated: 2019-02-07. Bibliographically approved.
Alexanderson, S., House, D. & Beskow, J. (2016). Automatic annotation of gestural units in spontaneous face-to-face interaction. In: MA3HMI 2016 - Proceedings of the Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction. Paper presented at 2016 Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction, MA3HMI 2016, 12 November 2016 through 16 November 2016 (pp. 15-19).
2016 (English). In: MA3HMI 2016 - Proceedings of the Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction, 2016, pp. 15-19. Conference paper, Published paper (Peer-reviewed)
Abstract [en]

Speech and gesture co-occur in spontaneous dialogue in a highly complex fashion. There is a large variability in the motion that people exhibit during a dialogue, and different kinds of motion occur during different states of the interaction. A wide range of multimodal interface applications, for example in the fields of virtual agents or social robots, can be envisioned where it is important to be able to automatically identify gestures that carry information and discriminate them from other types of motion. While it is easy for a human to distinguish and segment manual gestures from a flow of multimodal information, the same task is not trivial to perform for a machine. In this paper we present a method to automatically segment and label gestural units from a stream of 3D motion capture data. The gestural flow is modeled with a 2-level Hierarchical Hidden Markov Model (HHMM) where the sub-states correspond to gesture phases. The model is trained based on labels of complete gesture units and self-adaptive manipulators. The model is tested and validated on two datasets differing in genre and in method of capturing motion, and outperforms a state-of-the-art SVM classifier on a publicly available dataset.
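
The paper's 2-level HHMM is not reproduced here, but the overall idea can be sketched with a single-level Gaussian HMM whose hidden states are read as gesture phases. This is a simplification under stated assumptions: the five-phase inventory, the feature representation, and the rest-state heuristic are all illustrative choices, not the authors' method.

import numpy as np
from hmmlearn.hmm import GaussianHMM  # pip install hmmlearn

PHASES = ["rest", "preparation", "stroke", "hold", "retraction"]  # assumed inventory

def segment_gestures(motion_features):
    # motion_features: (n_frames, n_dims) array of mocap-derived features,
    # e.g. hand position and velocity per frame.
    model = GaussianHMM(n_components=len(PHASES), covariance_type="diag", n_iter=50)
    model.fit(motion_features)
    states = model.predict(motion_features)  # one phase index per frame
    # Guess which learned state is "rest" as the one with the smallest
    # mean feature magnitude, then merge consecutive non-rest frames
    # into gesture units (start_frame, end_frame).
    rest = int(np.argmin(np.abs(model.means_).sum(axis=1)))
    units, start = [], None
    for i, s in enumerate(states):
        if s != rest and start is None:
            start = i
        elif s == rest and start is not None:
            units.append((start, i))
            start = None
    if start is not None:
        units.append((start, len(states)))
    return units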

Keywords
Gesture recognition, Motion capture, Spontaneous dialogue, Hidden Markov models, Man machine systems, Markov processes, Online systems, 3D motion capture, Automatic annotation, Face-to-face interaction, Hierarchical hidden markov models, Multi-modal information, Multi-modal interfaces, Classification (of information)
HSV category
Identifiers
urn:nbn:se:kth:diva-202135 (URN); 10.1145/3011263.3011268 (DOI); 2-s2.0-85003571594 (Scopus ID); 9781450345620 (ISBN)
Conference
2016 Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction, MA3HMI 2016, 12 November 2016 through 16 November 2016
Research funder
Swedish Research Council, 2010-4646
Note

Funding text: The work reported here was carried out within the projects "Timing of intonation and gestures in spoken communication" (P12-0634:1), funded by the Bank of Sweden Tercentenary Foundation, and "Large-scale massively multimodal modelling of non-verbal behaviour in spontaneous dialogue" (VR 2010-4646), funded by the Swedish Research Council.

Available from: 2017-03-13. Created: 2017-03-13. Last updated: 2017-11-24. Bibliographically approved.
Zellers, M., House, D. & Alexanderson, S. (2016). Prosody and hand gesture at turn boundaries in Swedish. In: Proceedings of the International Conference on Speech Prosody. Paper presented at 8th Speech Prosody 2016, 31 May 2016 through 3 June 2016 (pp. 831-835). International Speech Communication Association
2016 (English). In: Proceedings of the International Conference on Speech Prosody, International Speech Communication Association, 2016, pp. 831-835. Conference paper, Published paper (Peer-reviewed)
Abstract [en]

In order to ensure smooth turn-taking between conversational participants, interlocutors must have ways of providing information to one another about whether they have finished speaking or intend to continue. The current work investigates Swedish speakers’ use of hand gestures in conjunction with turn change or turn hold in unrestricted, spontaneous speech. As has been reported by other researchers, we find that speakers’ gestures end before the end of speech in cases of turn change, while they may extend well beyond the end of a given speech chunk in the case of turn hold. We investigate the degree to which prosodic cues and gesture cues to turn transition in Swedish face-to-face conversation are complementary versus functioning additively. The co-occurrence of acoustic prosodic features and gesture at potential turn boundaries gives strong support for considering hand gestures as part of the prosodic system, particularly in the context of discourse-level information such as maintaining smooth turn transition.
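
A toy operationalization of the reported offset pattern, purely illustrative: the function name, the seconds-based timing, and the margin value are assumptions, not values from the paper.

def predict_transition(speech_end, gesture_end, margin=0.2):
    # Gestures ending before the end of speech pattern with turn change;
    # gestures extending well past the speech chunk pattern with turn hold.
    if gesture_end <= speech_end - margin:
        return "turn change"
    if gesture_end >= speech_end + margin:
        return "turn hold"
    return "uncertain"

print(predict_transition(speech_end=12.4, gesture_end=11.9))  # -> turn change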

Place, publisher, year, edition, pages
International Speech Communication Association, 2016
Keywords
Gesture, Multimodal communication, Swedish, Turn transition, Co-occurrence, Face-to-face conversation, Multimodal communications, Prosodic features, Smooth turn-taking, Spontaneous speech, Speech
HSV category
Identifiers
urn:nbn:se:kth:diva-195492 (URN); 2-s2.0-84982980451 (Scopus ID)
Conference
8th Speech Prosody 2016, 31 May 2016 through 3 June 2016
Note

QC 20161125

Available from: 2016-11-25. Created: 2016-11-03. Last updated: 2018-01-13. Bibliographically approved.
Strömbergsson, S., Salvi, G. & House, D. (2015). Acoustic and perceptual evaluation of category goodness of /t/ and /k/ in typical and misarticulated children's speech. Journal of the Acoustical Society of America, 137(6), 3422-3435
2015 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 137, no. 6, pp. 3422-3435. Journal article (Peer-reviewed). Published
Abstract [en]

This investigation explores perceptual and acoustic characteristics of children's successful and unsuccessful productions of /t/ and /k/, with a specific aim of exploring perceptual sensitivity to phonetic detail, and the extent to which this sensitivity is reflected in the acoustic domain. Recordings were collected from 4- to 8-year-old children with a speech sound disorder (SSD) who misarticulated one of the target plosives, and compared to productions recorded from peers with typical speech development (TD). Perceptual responses were registered on a visual-analog scale ranging from "clear [t]" to "clear [k]." Statistical models of prototypical productions were built, based on spectral moments and discrete cosine transform features, and used in the scoring of SSD productions. In the perceptual evaluation, "clear substitutions" were rated as less prototypical than correct productions. Moreover, target-appropriate productions of /t/ and /k/ produced by children with SSD were rated as less prototypical than those produced by TD peers. The acoustical modeling could to a large extent discriminate between the gross categories /t/ and /k/, and scored the SSD utterances on a continuous scale that was largely consistent with the category of production. However, none of the methods exhibited the same sensitivity to phonetic detail as the human listeners.
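
The two acoustic feature sets named here can be sketched as follows. This is a minimal sketch under assumed settings (one magnitude spectrum taken at the plosive burst, 12 DCT coefficients kept), not the authors' exact pipeline:

import numpy as np
from scipy.fft import dct

def spectral_moments(magnitudes, freqs):
    # First four spectral moments of a burst spectrum:
    # centroid, variance, skewness, kurtosis.
    p = magnitudes / magnitudes.sum()  # treat the spectrum as a distribution
    centroid = np.sum(freqs * p)
    variance = np.sum(((freqs - centroid) ** 2) * p)
    std = np.sqrt(variance)
    skewness = np.sum(((freqs - centroid) ** 3) * p) / std ** 3
    kurtosis = np.sum(((freqs - centroid) ** 4) * p) / std ** 4
    return centroid, variance, skewness, kurtosis

def dct_features(log_spectrum, n_coeffs=12):
    # Low-order DCT coefficients summarizing gross spectral shape;
    # the number of coefficients kept is an assumption.
    return dct(log_spectrum, norm="ortho")[:n_coeffs]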

HSV category
Identifiers
urn:nbn:se:kth:diva-171155 (URN); 10.1121/1.4921033 (DOI); 000356622400057 (ISI); 26093431 (PubMedID); 2-s2.0-84935019965 (Scopus ID)
Note

QC 20150720

Available from: 2015-07-20. Created: 2015-07-20. Last updated: 2017-12-04. Bibliographically approved.
Artman, H., House, D. & Hulten, M. (2015). Designed by Engineers: An analysis of interactionaries with engineering students. Designs for Learning, 7(2), 28-56, Article ID 10.2478/dfl-2014-0062.
2015 (English). In: Designs for Learning, ISSN 1654-7608, Vol. 7, no. 2, pp. 28-56, article id 10.2478/dfl-2014-0062. Journal article (Peer-reviewed). Published
Abstract [en]

The aim of this study is to describe and analyze learning taking place in a collaborative design exercise involving engineering students. The students perform a time-constrained, open-ended, complex interaction design task, an “interactionary”. A multimodal learning perspective is used. We have performed detailed analyses of video recordings of the engineering students, including classifying aspects of interaction. Our results show that the engineering students carry out and articulate their design work using a technology-centred approach and focus more on the function of their designs than on aspects of interaction. The engineering students mainly make use of ephemeral communication strategies (gestures and speech) rather than sketching in physical materials. We conclude that the interactionary may be an educational format that can help engineering students learn the messiness of design work. We further identify several constraints to the engineering students’ design learning and propose useful interventions that a teacher could make during an interactionary. We especially emphasize interventions that help engineering students retain aspects of human-centered design throughout the design process. This study partially replicates a previous study which involved interaction design students.

Place, publisher, year, edition, pages
De Gruyter Open, 2015
Keywords
design, engineering education, interactionary, interaction design, learning design sequence, multimodal learning
HSV category
Research programme
Human-Computer Interaction
Identifiers
urn:nbn:se:kth:diva-164951 (URN); 10.2478/dfl-2014-0062 (DOI)
Note

QC 20150424

Available from: 2015-04-21. Created: 2015-04-21. Last updated: 2017-12-04. Bibliographically approved.
Ambrazaitis, G., Svensson Lundmark, M. & House, D. (2015). Head beats and eyebrow movements as a function of phonological prominence levels and word accents in Stockholm Swedish news broadcasts. In: The 3rd European Symposium on Multimodal Communication. Paper presented at the 3rd European Symposium on Multimodal Communication, Dublin, 17-18 September 2015. Dublin, Ireland
2015 (English). In: The 3rd European Symposium on Multimodal Communication, Dublin, Ireland, 2015. Conference paper, Published paper (Peer-reviewed)
Place, publisher, year, edition, pages
Dublin, Ireland, 2015
HSV category
Identifiers
urn:nbn:se:kth:diva-180421 (URN)
Conference
The 3rd European Symposium on Multimodal Communication, Dublin, 17-18 September 2015
Note

QC 20160216

Available from: 2016-01-13. Created: 2016-01-13. Last updated: 2018-01-10. Bibliographically approved.
Ambrazaitis, G., Svensson Lundmark, M. & House, D. (2015). Head Movements, Eyebrows, and Phonological Prosodic Prominence Levels in Stockholm. In: 13th International Conference on Auditory-Visual Speech Processing (AVSP 2015). Paper presented at 13th International Conference on Auditory-Visual Speech Processing (AVSP 2015) (p. 42). Vienna, Austria
2015 (English). In: 13th International Conference on Auditory-Visual Speech Processing (AVSP 2015), Vienna, Austria, 2015, p. 42. Conference paper, Published paper (Peer-reviewed)
Place, publisher, year, edition, pages
Vienna, Austria, 2015
HSV category
Identifiers
urn:nbn:se:kth:diva-180420 (URN)
Conference
13th International Conference on Auditory-Visual Speech Processing (AVSP 2015)
Note

QC 20160216

Available from: 2016-01-13. Created: 2016-01-13. Last updated: 2018-01-10. Bibliographically approved.
Ambrazaitis, G., Svensson Lundmark, M. & House, D. (2015). Multimodal levels of prominence: a preliminary analysis of head and eyebrow movements in Swedish news broadcasts. In: Svensson Lundmark, M.; Ambrazaitis, G.; van de Weijer, J. (Eds.), Proceedings of Fonetik 2015. Paper presented at Fonetik 2015, Lund (pp. 11-16). Lund
2015 (English). In: Proceedings of Fonetik 2015 / [ed] Svensson Lundmark, M.; Ambrazaitis, G.; van de Weijer, J., Lund, 2015, pp. 11-16. Conference paper, Published paper (Other academic)
Place, publisher, year, edition, pages
Lund, 2015
HSV category
Identifiers
urn:nbn:se:kth:diva-180416 (URN)
Conference
Fonetik 2015, Lund
Note

QC 20160216

Available from: 2016-01-13. Created: 2016-01-13. Last updated: 2018-01-10. Bibliographically approved.
House, D., Alexanderson, S. & Beskow, J. (2015). On the temporal domain of co-speech gestures: syllable, phrase or talk spurt? In: Svensson Lundmark, M.; Ambrazaitis, G.; van de Weijer, J. (Eds.), Proceedings of Fonetik 2015. Paper presented at Fonetik 2015, Lund (pp. 63-68).
2015 (English). In: Proceedings of Fonetik 2015 / [ed] Svensson Lundmark, M.; Ambrazaitis, G.; van de Weijer, J., 2015, pp. 63-68. Conference paper, Published paper (Other academic)
Abstract [en]

This study explores the use of automatic methods to detect and extract hand gesture movement co-occurring with speech. Two spontaneous dyadic dialogues were analyzed using 3D motion-capture techniques to track hand movement. Automatic speech/non-speech detection was performed on the dialogues resulting in a series of connected talk spurts for each speaker. Temporal synchrony of onset and offset of gesture and speech was studied between the automatic hand gesture tracking and talk spurts, and compared to an earlier study of head nods and syllable synchronization. The results indicated onset synchronization between head nods and the syllable in the short temporal domain and between the onset of longer gesture units and the talk spurt in a more extended temporal domain.
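
A minimal sketch of the onset-synchrony measurement described above, illustrative only; the (start, end)-in-seconds interval format and the function name are assumptions:

def onset_lags(gesture_units, talk_spurts):
    # Signed lag in seconds from each gesture onset to the nearest
    # talk-spurt onset; negative values mean the gesture leads speech.
    lags = []
    for g_start, _g_end in gesture_units:
        nearest = min(talk_spurts, key=lambda spurt: abs(spurt[0] - g_start))
        lags.append(round(g_start - nearest[0], 3))
    return lags

print(onset_lags([(1.8, 3.2)], [(2.0, 4.5), (6.0, 7.1)]))  # -> [-0.2]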

HSV category
Identifiers
urn:nbn:se:kth:diva-180407 (URN)
Conference
Fonetik 2015, Lund
Note

QC 20160216

Available from: 2016-01-13. Created: 2016-01-13. Last updated: 2018-01-10. Bibliographically approved.
Zellers, M. & House, D. (2015). Parallels between hand gestures and acoustic prosodic features in turn-taking. In: 14th International Pragmatics Conference. Paper presented at 14th International Pragmatics Conference (pp. 454-455). Antwerp, Belgium
2015 (English). In: 14th International Pragmatics Conference, Antwerp, Belgium, 2015, pp. 454-455. Conference paper, Published paper (Peer-reviewed)
Place, publisher, year, edition, pages
Antwerp, Belgium, 2015
HSV category
Identifiers
urn:nbn:se:kth:diva-180418 (URN)
Conference
14th International Pragmatics Conference
Note

QC 2016-02-18

Available from: 2016-01-13. Created: 2016-01-13. Last updated: 2018-01-10. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-4628-3769