Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-0397-6442
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-1175-840X
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-1399-6604
2023 (English). In: 23rd ACM International Conference on Intelligent Virtual Agent (IVA 2023), Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Engaging embodied conversational agents need to generate expressive behavior in order to be believable in socializing interactions. We present a system that can generate spontaneous speech with supporting lip movements. The neural conversational TTS voice is trained on a multi-style speech corpus that has been prosodically tagged (pitch and speaking rate) and transcribed (including tokens for breathing, fillers and laughter). We introduce a speech animation algorithm where articulatory effort can be adjusted. The facial animation is driven by time-stamped phonemes and prominence estimates from the synthesised speech waveform to modulate the lip and jaw movements accordingly. In objective evaluations we show that the system is able to generate speech and facial animation that vary in articulation effort. In subjective evaluations we compare our conversational TTS system’s capability to deliver jokes with that of a commercial TTS. Both systems succeeded equally well.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023.
Identifiers
URN: urn:nbn:se:kth:diva-341039
DOI: 10.1145/3570945.3607289
Scopus ID: 2-s2.0-85183581153
OAI: oai:DiVA.org:kth-341039
DiVA, id: diva2:1820903
Conference
23rd ACM International Conference on Intelligent Virtual Agent (IVA 2023), Würzburg, Germany, Jan 5 2023 - Jan 8 2023
Note

Part of ISBN 9798350345445

QC 20231124

Available from: 2023-12-19. Created: 2023-12-19. Last updated: 2024-02-09. Bibliographically approved.

Open Access in DiVA

Full text (10059 kB), 95 downloads
File information
File name: FULLTEXT01.pdf
File size: 10059 kB
Checksum (SHA-512):
57413af67560250a143cb519cad54592d14d91be581ae656f66ea3c52861db2833a9dce56859d1bfdaf66e8e7ffa82e09741556caedd1718f353c7ba1795dc3f
Type: fulltext
Mimetype: application/pdf

Authors

Gustafsson, Joakim; Székely, Éva; Beskow, Jonas

Total: 96 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.

Total: 171 hits