A method for the detection of communicative head nods in expressive speech
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology.
2006 (English). In: Papers from the Second Nordic Conference on Multimodal Communication 2005 / [ed] Allwood, J.; Dorriots, B.; Nicholson, S. Göteborg: Göteborg University, 2006, pp. 153-165. Conference paper (Refereed)
Abstract [en]

The aim of this study is to propose a method for the automatic detection of head nods during the production of semi-spontaneous speech. The method also provides means for extracting certain characteristics of head nods that may vary depending on placement, function and even underlying emotional expression. The material used is part of the Swedish PF-Star corpora, which were recorded by means of an optical motion capture system (Qualisys) able to register articulatory movements as well as head movements and facial expressions. The material consists of short sentences as well as dialogic speech produced by a Swedish actor. The method for automatic head nod detection on the 3D data acquired with Qualisys is based on criteria for slope, amplitude and a minimum number of consecutive frames. The criteria are tuned on head nods that have been manually annotated. These parameters can be varied to detect different kinds of head movements and can also be combined with other parameters in order to detect facial gestures, such as eyebrow displacements. This study focuses in particular on the detection of head nods, since earlier studies have found them to be important visual cues, especially for signaling feedback and focus. In order to evaluate the method, a preliminary test was run on semi-spontaneous dialogic speech, also part of the Swedish PF-Star corpora and produced by the same actor who read the sentences. The results show that the parameters and criteria set on the basis of the training corpus are also valid for the dialogic speech, although more sophisticated parameters could yield more precise results.
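The detection criteria described in the abstract (slope, amplitude, and a minimum number of consecutive frames applied to motion-capture head trajectories) can be sketched as follows. This is a minimal illustration of that kind of rule-based detector, not the authors' implementation; the function name, threshold values, and signal units are all assumptions.

```python
import numpy as np

def detect_nods(y, slope_thresh=0.5, amp_thresh=2.0, min_frames=5):
    """Flag candidate head nods in a 1-D vertical head-position signal.

    A candidate is a run of consecutive frames whose frame-to-frame slope
    magnitude exceeds `slope_thresh`, whose peak-to-peak amplitude within
    the run exceeds `amp_thresh`, and whose length is at least `min_frames`.
    All threshold values here are illustrative, not taken from the paper.
    Returns a list of (start_frame, end_frame) tuples.
    """
    slopes = np.diff(y)                       # frame-to-frame slope
    active = np.abs(slopes) > slope_thresh    # frames with steep movement
    nods = []
    start = None
    for i, a in enumerate(active):
        if a and start is None:
            start = i                         # run of steep frames begins
        elif not a and start is not None:
            seg = y[start:i + 1]
            # accept the run only if it is long enough and large enough
            if i - start >= min_frames and seg.max() - seg.min() > amp_thresh:
                nods.append((start, i))
            start = None
    if start is not None:                     # run extends to the signal end
        seg = y[start:]
        if len(active) - start >= min_frames and seg.max() - seg.min() > amp_thresh:
            nods.append((start, len(active)))
    return nods
```

In practice the thresholds would be tuned on the manually annotated nods, as the abstract describes, and the same run-based logic could be applied to other trajectories (e.g. eyebrow markers) to detect other facial gestures.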

Place, publisher, year, edition, pages
Göteborg: Göteborg University, 2006, pp. 153-165.
Gothenburg papers in theoretical linguistics, ISSN 0349-1021; 92
Keyword [en]
Multimodality, Swedish, Automatic recognition, Gestural communication, Computational linguistics
National Category
Computer Science; Language Technology (Computational Linguistics)
URN: urn:nbn:se:kth:diva-51930
OAI: diva2:465224
The Second Nordic Conference on Multimodal Communication, Göteborg, 07/04/2005
QC 20120103. Available from: 2011-12-14. Created: 2011-12-14. Last updated: 2012-01-03. Bibliographically approved.

Open Access in DiVA

No full text

Search in DiVA

By author/editor
Cerrato, Loredana; Svanfeldt, Gunilla
By organisation
Speech Communication and Technology
Computer Science; Language Technology (Computational Linguistics)