Can you tell if tongue movements are real or synthetic?
Engwall, Olov
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT. ORCID iD: 0000-0003-4532-014X
Wik, Preben
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT.
2009 (English). In: Proceedings of AVSP, 2009. Conference paper, Published paper (Refereed)
Abstract [en]

We have investigated whether subjects are aware of what natural tongue movements look like, by showing them animations based on either measurements or rule-based synthesis. The issue is of interest since a previous audiovisual speech perception study showed that the word recognition rate in sentences with degraded audio was significantly better with real tongue movements than with synthesized ones. The subjects in the current study could not, as a group, tell which movements were real, with a classification score at chance level. About half of the subjects were significantly better at discriminating between the two types of animations, but their classification scores were as often well below chance as above it. The correlation between classification score and word recognition rate for subjects who also participated in the perception study was very weak, suggesting that the higher recognition score for real tongue movements may be due to subconscious, rather than conscious, processes. This finding could potentially be interpreted as an indication that audiovisual speech perception is based on articulatory gestures.
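The paper does not state which statistical tests were used; as an illustrative sketch of the reasoning in the abstract, the Python snippet below shows one common way to test a classification score against chance (a two-sided binomial test) and to correlate per-subject classification scores with word recognition rates (Pearson's r). All numbers, variable names, and the choice of tests here are assumptions for illustration, not data or methods from the study.

    # Illustrative sketch only: the paper does not specify its analysis.
    # All numbers below are hypothetical placeholders, not study data.
    from scipy.stats import binomtest, pearsonr

    # Is a subject's real-vs-synthetic classification score above chance (50%)?
    n_trials = 40          # hypothetical number of animations judged
    n_correct = 23         # hypothetical number classified correctly
    test = binomtest(n_correct, n_trials, p=0.5)  # two-sided by default
    print(f"classification score = {n_correct / n_trials:.2f}, p = {test.pvalue:.3f}")

    # A weak correlation between classification score and word recognition
    # rate would suggest the recognition benefit is not driven by conscious
    # detection of which movements are real.
    classification_scores = [0.45, 0.60, 0.35, 0.55, 0.50]   # hypothetical
    word_recognition_rates = [0.62, 0.58, 0.60, 0.64, 0.59]  # hypothetical
    r, p = pearsonr(classification_scores, word_recognition_rates)
    print(f"Pearson r = {r:.2f}, p = {p:.3f}")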

Place, publisher, year, edition, pages
2009.
Keywords [en]
augmented reality, tongue reading, visual speech synthesis, data-driven animation
National Category
Computer Sciences; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52102
OAI: oai:DiVA.org:kth-52102
DiVA, id: diva2:465397
Conference
International Conference on Auditory-Visual Speech Processing 2009, 10-13 September 2009, University of East Anglia, Norwich, UK
Note
tmh_import_11_12_14. QC 20111229. Available from: 2011-12-14. Created: 2011-12-14. Last updated: 2022-06-24. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

http://www.speech.kth.se/prod/publications/files/3359.pdf
