Predicting synthetic voice style from facial expressions. An application for augmented conversations
2014 (English). In: Speech Communication, ISSN 0167-6393, E-ISSN 1872-7182, Vol. 57, pp. 63-75. Article in journal (Refereed), Published.
Abstract [en]

The ability to efficiently facilitate social interaction and emotional expression is an important, yet unmet requirement for speech generating devices aimed at individuals with speech impairment. Using gestures such as facial expressions to control aspects of expressive synthetic speech could contribute to an improved communication experience for both the user of the device and the conversation partner. For this purpose, a mapping model between facial expressions and speech is needed that is high level (utterance-based), versatile and personalisable. In the mapping developed in this work, the visual and auditory modalities are connected based on the intended emotional salience of a message: the intensity of the user's facial expressions is mapped to the emotional intensity of the synthetic speech. The mapping model has been implemented in a system called WinkTalk that uses estimated facial expression categories and their intensity values to automatically select between three expressive synthetic voices reflecting three degrees of emotional intensity. An evaluation is conducted through an interactive experiment using simulated augmented conversations. The results have shown that automatic control of synthetic speech through facial expressions is fast, non-intrusive, sufficiently accurate and supports the user in feeling more involved in the conversation. It can be concluded that the system has the potential to facilitate a more efficient communication process between user and listener.
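The core selection step described in the abstract — mapping an estimated facial-expression intensity to one of three expressive voices — can be sketched as follows. This is a minimal illustration only: the thresholds and voice labels are assumptions, not the actual values or identifiers used in the WinkTalk system.

```python
def select_voice(expression_intensity: float) -> str:
    """Map an estimated facial-expression intensity in [0, 1] to one of
    three expressive synthetic voices of increasing emotional intensity.

    Thresholds (0.33, 0.66) and voice labels are hypothetical.
    """
    if not 0.0 <= expression_intensity <= 1.0:
        raise ValueError("intensity must be in [0, 1]")
    if expression_intensity < 0.33:
        return "low"       # calm, near-neutral voice
    elif expression_intensity < 0.66:
        return "medium"    # moderately expressive voice
    return "high"          # strongly expressive voice

print(select_voice(0.8))  # → high
```

A per-utterance mapping like this is what makes the control "high level": the user's expression selects a whole voice style for the utterance rather than modulating low-level prosodic parameters frame by frame.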

Place, publisher, year, edition, pages
2014. Vol. 57, pp. 63-75.
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-185529
DOI: 10.1016/j.specom.2013.09.003
ISI: 000328180100005
ScopusID: 2-s2.0-84885457827
OAI: oai:DiVA.org:kth-185529
DiVA: diva2:922519
Note

QC 20160426

Available from: 2016-04-22. Created: 2016-04-21. Last updated: 2016-04-26. Bibliographically approved.

Open Access in DiVA

No full text

By author/editor
Székely, Éva
