Lip-reading: Furhat audio visual intelligibility of a back projected animated face
2012 (English). In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer Berlin/Heidelberg, 2012, 196-203 p. Conference paper (Refereed)
Back projecting a computer-animated face onto a three-dimensional static physical model of a face is a promising technology that is gaining ground as a solution for building situated, flexible and human-like robot heads. In this paper, we first briefly describe Furhat, a back-projected robot head built for multimodal multiparty human-machine interaction, and its benefits over virtual characters and robotic heads; we then motivate the need to investigate the contribution Furhat's face makes to speech intelligibility. We present an audio-visual speech intelligibility experiment in which 10 subjects listened to short sentences with a degraded speech signal. The experiment compares the gain in intelligibility from lip reading a face visualized on a 2D screen with that from a 3D back-projected face, at different viewing angles. The results show that audio-visual speech intelligibility holds when the avatar is projected onto a static face model (as in Furhat) and even, rather surprisingly, exceeds that of the flat display. This means that despite the movement limitations that back-projected animated face models bring about, their audio-visual speech intelligibility is equal to, or even higher than, that of the same models shown on flat displays. At the end of the paper we discuss several hypotheses on how to interpret the results and motivate future investigations to better explore the characteristics of visual speech perception of 3D projected faces.
Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2012. 196-203 p.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN 0302-9743 ; 7502 LNAI
Furhat, Lip reading, Robot Heads, Talking Head, Visual Speech
Computer Science Language Technology (Computational Linguistics)
Identifiers: URN: urn:nbn:se:kth:diva-104969; DOI: 10.1007/978-3-642-33197-8-20; ScopusID: 2-s2.0-84867509147; ISBN: 978-364233196-1; OAI: oai:DiVA.org:kth-104969; DiVA: diva2:567899
12th International Conference on Intelligent Virtual Agents, IVA 2012, 12 September 2012 through 14 September 2012, Santa Cruz, CA
Funder: ICT - The Next Generation
QC 20121114; 2012-11-14; 2012-11-14; 2013-04-15; Bibliographically approved