Data-driven synthesis of expressive visual speech using an MPEG-4 talking head
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-1399-6604
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
2005 (English) In: 9th European Conference on Speech Communication and Technology, Lisbon, 2005, 793-796 p. Conference paper, Published paper (Refereed)
Abstract [en]

This paper describes initial experiments with synthesis of visual speech articulation for different emotions, using a newly developed MPEG-4 compatible talking head. The basic problem with combining speech and emotion in a talking head is to handle the interaction between emotional expression and articulation in the orofacial region. Rather than trying to model speech and emotion as two separate properties, the strategy taken here is to incorporate emotional expression in the articulation from the beginning. We use a data-driven approach, training the system to recreate the expressive articulation produced by an actor while portraying different emotions. Each emotion is modelled separately using principal component analysis and a parametric coarticulation model. The results so far are encouraging but more work is needed to improve naturalness and accuracy of the synthesized speech.

Place, publisher, year, edition, pages
Lisbon, 2005. 793-796 p.
National Category
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-51886
Scopus ID: 2-s2.0-33745218748
OAI: oai:DiVA.org:kth-51886
DiVA: diva2:465180
Conference
9th European Conference on Speech Communication and Technology; Lisbon; 4 September 2005 through 8 September 2005
Note
QC 20120313
Available from: 2011-12-14 Created: 2011-12-14 Last updated: 2012-03-13 Bibliographically approved

Authors
Beskow, Jonas; Nordenberg, Mikael