Lip Synchronization: from Phone Lattice to PCA Eigen-projections using Neural Networks
KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT. KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology.
2008 (English)
In: INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, BAIXAS: ISCA-INST SPEECH COMMUNICATION ASSOC, 2008, p. 2016-2019
Conference paper, Published paper (Refereed)
Abstract [en]

Lip synchronization is the process of generating natural lip movements from a speech signal. In this work we address the lip-sync problem using an automatic phone recognizer that generates a phone lattice carrying posterior probabilities. The acoustic feature vector contains the posterior probabilities of all the phones over a time window centered at the current time point. Hence this representation characterizes the phone recognition output including the confusion patterns caused by its limited accuracy. A 3D face model with varying texture is computed by analyzing a video recording of the speaker using a 3D morphable model. Training a neural network using 30 000 data vectors from an audiovisual recording in Dutch resulted in a very good simulation of the face on independent data sets of the same or of a different speaker.
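
The feature construction and the PCA target space described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names `window_posteriors` and `pca_project`, the edge-padding strategy, the window half-width, and the toy array sizes are all assumptions made for illustration, and the neural network that maps one representation to the other is not shown.

```python
import numpy as np

def window_posteriors(posteriors, half_width):
    """Stack phone posteriors from a window centred on each frame.

    posteriors: (n_frames, n_phones) array, one posterior distribution per frame.
    half_width: number of context frames on each side (hypothetical parameter).
    Edges are padded by repeating the first/last frame (an assumption).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    n_frames, _ = posteriors.shape
    padded = np.pad(posteriors, ((half_width, half_width), (0, 0)), mode="edge")
    width = 2 * half_width + 1
    # Row t concatenates padded[t : t + width], i.e. the window centred on frame t.
    return np.stack([padded[t:t + width].reshape(-1) for t in range(n_frames)])

def pca_project(data, n_components):
    """Project rows of `data` onto the leading principal components."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean
    # Right singular vectors of the centred data are the PCA eigenvectors.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    return centered @ basis.T, mean, basis

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    post = rng.dirichlet(np.ones(5), size=20)   # toy posteriors: 20 frames, 5 phones
    feats = window_posteriors(post, half_width=2)
    faces = rng.normal(size=(20, 12))           # toy face-model parameters per frame
    coeffs, mean, basis = pca_project(faces, n_components=3)
    print(feats.shape, coeffs.shape)
```

In a setup like this, the windowed posterior vectors would serve as network inputs and the PCA coefficients of the face model as regression targets; a predicted coefficient vector `c` maps back to face parameters as `mean + c @ basis`.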

Place, publisher, year, edition, pages
BAIXAS: ISCA-INST SPEECH COMMUNICATION ASSOC, 2008. p. 2016-2019
Keywords [en]
lip synchronization, speech recognition, phone lattice, 3D morphable models, principal component analysis, audio visual speech
National Category
Computer and Information Sciences; General Language Studies and Linguistics
Identifiers
URN: urn:nbn:se:kth:diva-29854
ISI: 000277026101077
Scopus ID: 2-s2.0-84867204708
ISBN: 978-1-61567-378-0 (print)
OAI: oai:DiVA.org:kth-29854
DiVA, id: diva2:399745
Conference
9th Annual Conference of the International Speech Communication Association (INTERSPEECH 2008)
Note
QC 20110222
Available from: 2011-02-23 Created: 2011-02-17 Last updated: 2022-06-25
Bibliographically approved

Author
Al Moubayed, Samer