A method for the detection of communicative head nods in expressive speech
2006 (English). In: Papers from the Second Nordic Conference on Multimodal Communication 2005 / [ed] Allwood, J.; Dorriots, B.; Nicholson, S., Göteborg: Göteborg University, 2006, 153-165 p. Conference paper (Refereed)
The aim of this study is to propose a method for the automatic detection of head nods during the production of semi-spontaneous speech. The method also provides a means of extracting certain characteristics of head nods that may vary depending on placement, function and even underlying emotional expression. The material used is part of the Swedish PF-Star corpora, which were recorded by means of an optical motion capture system (Qualisys) able to register articulatory movements as well as head movements and facial expressions. The material consists of short sentences as well as dialogic speech produced by a Swedish actor. The method for automatic head nod detection on the 3D data acquired with Qualisys is based on criteria for slope, amplitude and a minimum number of consecutive frames. The criteria are tuned on head nods that have been manually annotated. These parameters can be varied to detect different kinds of head movements and can also be combined with other parameters in order to detect facial gestures, such as eyebrow displacements. This study focuses on the detection of head nods, since earlier studies have found them to be important visual cues, in particular for signaling feedback and focus. To evaluate the method, a preliminary test was run on semi-spontaneous dialogic speech, which is also part of the Swedish PF-Star corpora and was produced by the same actor who read the sentences. The results show that the parameters and criteria set on the basis of the training corpus are also valid for the dialogic speech, even if more sophisticated parameters could be useful to achieve a more precise result.
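The three criteria named in the abstract (slope, amplitude and a minimum number of consecutive frames) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives neither the threshold values nor the exact definitions, so the function name `detect_nods`, the choice of a single vertical head coordinate per frame as input, the reading of a nod as a run of downward movement, and all parameter names are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def detect_nods(y: Sequence[float],
                min_slope: float,
                min_amplitude: float,
                min_frames: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of candidate nods.

    Assumptions (not taken from the paper): y holds one vertical head
    coordinate per frame, and a candidate nod is a run of consecutive
    frames in which y drops by at least min_slope per frame.
    """
    nods: List[Tuple[int, int]] = []
    start = None  # first frame of the current downward run, if any
    # Iterate one step past the end so a run reaching the last frame is closed.
    for i in range(1, len(y) + 1):
        falling = i < len(y) and (y[i - 1] - y[i]) >= min_slope
        if falling:
            if start is None:
                start = i - 1
        elif start is not None:
            end = i - 1
            n_frames = end - start + 1
            # Keep the run only if it is long enough and deep enough.
            if n_frames >= min_frames and (y[start] - y[end]) >= min_amplitude:
                nods.append((start, end))
            start = None
    return nods


# Hypothetical trajectory: one clear downward movement, then a short dip.
trajectory = [0.0, 0.0, -1.0, -2.0, -3.0, -3.0, -2.0, -2.5, -2.5]
candidates = detect_nods(trajectory, min_slope=0.5, min_amplitude=2.0, min_frames=3)
```

In this sketch the short dip near the end of the trajectory is rejected because it spans too few frames, which is the role the minimum-frame criterion plays. Thresholds of this kind would be tuned against manually annotated nods, as the abstract describes.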
Place, publisher, year, edition, pages
Göteborg: Göteborg University, 2006. 153-165 p.
Gothenburg papers in theoretical linguistics, ISSN 0349-1021 ; 92
Multimodality, Swedish, Automatic recognition, Gestural communication, Computational linguistics
Computer Science; Language Technology (Computational Linguistics)
Identifiers: URN: urn:nbn:se:kth:diva-51930; OAI: oai:DiVA.org:kth-51930; DiVA: diva2:465224
The Second Nordic Conference on Multimodal Communication, Göteborg, 07/04/2005