Measurements of articulatory variation in expressive speech for a set of Swedish vowels
Nordstrand, Magnus; Svanfeldt, Gunilla; Granström, Björn; House, David (ORCID iD: 0000-0002-4628-3769)
KTH, Superseded Departments, Speech, Music and Hearing.
2004 (English). In: Speech Communication, ISSN 0167-6393, Vol. 44, no. 1-4, pp. 187-196. Article in journal (Refereed). Published.
Abstract [en]

Facial gestures are used to convey e.g. emotions, dialogue states and conversational signals, which support us in the interpretation of other people's feelings and intentions. Synthesising this behaviour with an animated talking head would widen the possibilities of this intuitive interface. The dynamic characteristics of these facial gestures during speech affect articulation. Previously, articulation for neutral speech has been studied and implemented in animation rules. The results obtained in this study show how some articulatory parameters are affected by the influence of expressiveness in speech for a selection of Swedish vowels. Our focus has primarily been on attitudes and emotions conveying information that is intended to make an animated agent more "human-like". A multimodal corpus of acted expressive speech has been collected for this purpose.
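The kind of comparison reported in the abstract can be illustrated with a small sketch. The following Python fragment is a hypothetical illustration only: the parameter (lip opening), the expressive modes and all numbers are invented and are not taken from the study. It merely shows one simple way to express how an expressive mode shifts an articulatory measurement away from its neutral value for each vowel.

    # Hypothetical illustration only: the paper measures articulatory parameters
    # for Swedish vowels under different expressive modes; the parameter name,
    # units and values below are invented for this sketch.
    from collections import defaultdict
    from statistics import mean

    # (vowel, expressive mode, lip opening in mm)
    measurements = [
        ("a", "neutral", 14.2), ("a", "neutral", 13.8),
        ("a", "happy",   16.9), ("a", "happy",   17.4),
        ("u", "neutral",  6.1), ("u", "neutral",  5.8),
        ("u", "happy",    7.0), ("u", "happy",    7.3),
    ]

    grouped = defaultdict(list)
    for vowel, mode, value in measurements:
        grouped[(vowel, mode)].append(value)

    # Report each expressive mode's mean as a deviation from the neutral mean
    # for the same vowel, one simple way to quantify the articulatory shift.
    for vowel in sorted({v for v, _, _ in measurements}):
        neutral_mean = mean(grouped[(vowel, "neutral")])
        for (v, mode), values in sorted(grouped.items()):
            if v != vowel or mode == "neutral":
                continue
            delta = mean(values) - neutral_mean
            print(f"vowel /{vowel}/, mode {mode}: "
                  f"mean lip opening {mean(values):.1f} mm ({delta:+.1f} vs neutral)")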

Place, publisher, year, edition, pages
2004. Vol. 44, no. 1-4, pp. 187-196.
Keywords [en]
talking heads, expressive speech, facial gestures, articulation
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-6512
DOI: 10.1016/j.specom.2004.09.003
ISI: 000226074500015
Scopus ID: 2-s2.0-10444267340
OAI: oai:DiVA.org:kth-6512
DiVA: diva2:11246
Note
QC 20101126 QC 20110922. Workshop on Audio-Visual Speech Processing, St Jorioz, France, 2003.
Available from: 2006-12-06. Created: 2006-12-06. Last updated: 2012-03-22. Bibliographically approved.
In thesis
1. Expressiveness in virtual talking faces
2006 (English). Licentiate thesis, comprehensive summary (Other scientific).
Abstract [en]

In this thesis, different aspects of how to make synthetic talking faces more expressive have been studied. How can we collect data for the studies? How is lip articulation affected by expressive speech? Can the recorded data be used interchangeably in different face models? Can we use eye movements in the agent for communicative purposes? The thesis includes studies of these questions, as well as an experiment in which a talking head is used as a complement to a targeted audio device in order to increase the intelligibility of speech.

The data collection described in the first paper resulted in two multimodal speech corpora. The subsequent analysis of the recorded data showed that expressive modes strongly affect speech articulation, although further studies are needed to obtain more quantitative results, to cover more phonemes and expressions, and to generalise the results beyond a single speaker.

When exchanging files containing facial animation parameters (FAPs) between different face models (and research sites), several problems were encountered even though both face models were created according to the MPEG-4 standard. The evaluation of the implemented emotional expressions showed that the best recognition results were obtained when the face model and the FAP file originated from the same site.
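The transfer problem can be made concrete with a small sketch. The following Python fragment assumes the usual MPEG-4 convention that a FAP amplitude is a dimensionless number scaled by a face-specific FAPU (Facial Animation Parameter Unit) derived from the model's neutral geometry. The FAP names follow the standard, but the measurements, amplitudes and the fapu helper are invented for illustration and do not describe the face models actually used in the thesis.

    # Minimal sketch of why the same FAP file can look different on two
    # MPEG-4 face models: each model derives its own FAPU values from its
    # neutral geometry, so identical FAP amplitudes yield different
    # displacements.  All numbers below are invented for illustration.

    def fapu(distance_mm: float) -> float:
        # MPEG-4 defines a FAPU as a fixed fraction (1/1024) of a
        # neutral-face distance such as mouth width or mouth-nose separation.
        return distance_mm / 1024.0

    # Per-model FAPU values, derived from each model's own neutral geometry.
    model_a = {"open_jaw": fapu(30.0), "stretch_l_cornerlip": fapu(52.0)}
    model_b = {"open_jaw": fapu(36.0), "stretch_l_cornerlip": fapu(47.0)}

    # One frame from a hypothetical FAP file: FAP name -> amplitude in FAPUs.
    fap_frame = {"open_jaw": 512, "stretch_l_cornerlip": 200}

    for name, amplitude in fap_frame.items():
        disp_a = amplitude * model_a[name]
        disp_b = amplitude * model_b[name]
        print(f"{name}: model A moves {disp_a:.2f} mm, model B moves {disp_b:.2f} mm")

    # Because the two models derive different FAPU values (and may also attach
    # FAPs to different vertices), the same FAP file need not produce the same
    # expression, consistent with the transfer problems reported in the thesis.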

The perception experiment, in which a synthetic talking head was combined with a targeted-audio parametric loudspeaker, showed that the virtual face augmented the intelligibility of speech, especially when the sound beam was directed slightly to the side of the listener, i.e. at lower sound intensities.

In the experiment with eye gaze in a virtual talking head, the possibility of achieving mutual gaze with the observer was assessed. The results indicated that this is possible, but they also pointed to some design features in the face model that need to be altered in order to achieve better control of the perceived gaze direction.

Place, publisher, year, edition, pages
Stockholm: KTH, 2006. 23 p.
Series
Trita-CSC-A, ISSN 1653-5723 ; 2006:28
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-4210
ISBN: 978-91-7178-530-5
Presentation
2006-12-18, Fantum, KTH, Lindstedtsvägen 24, floor 5, Stockholm, 15:00
Note
QC 20101126. Available from: 2006-12-06. Created: 2006-12-06. Last updated: 2010-11-26. Bibliographically approved.

Open Access in DiVA

No full text

Other links

Publisher's full text | Scopus
