Prosodic characteristics of English-accented Swedish neural TTS
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH, Speech Communication. Swedish Agency for Accessible Media. ORCID iD: 0000-0002-9659-1532
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-2598-6868
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-4628-3769
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0001-9327-9482
2024 (English). In: Proceedings of Speech Prosody 2024, Leiden, The Netherlands: International Speech Communication Association, 2024, pp. 1035-1039. Conference paper, published paper (peer-reviewed)
Abstract [en]

Neural text-to-speech synthesis (TTS) captures prosodic features strikingly well, notwithstanding the lack of prosodic labels in training or synthesis. We trained a voice on a single Swedish speaker reading in Swedish and English. The resulting TTS allows us to control the degree of English-accentedness in Swedish sentences. English-accented Swedish commonly exhibits well-known prosodic characteristics such as erroneous tonal accents and understated or missed durational differences. TTS quality was verified in three ways. Automatic speech recognition resulted in low errors, verifying intelligibility. Automatic language classification had Swedish as the majority choice, while the likelihood of English increased with our targeted degree of English-accentedness. Finally, a rank of perceived English-accentedness acquired through pairwise comparisons by 20 human listeners demonstrated a strong correlation with the targeted English-accentedness.

We report on phonetic and prosodic analyses of the accented TTS. In addition to the anticipated segmental differences, the analyses revealed temporal and prominence-related variations coherent with Swedish spoken by English speakers, such as missing Swedish stress patterns and overly reduced unstressed syllables. With this work, we aim to glean insights into speech prosody from the latent prosodic features of neural TTS models. In addition, it will help implement speech phenomena such as code switching in TTS.

Place, publisher, year, edition, pages
Leiden, The Netherlands: International Speech Communication Association, 2024. pp. 1035-1039
Keywords [en]
foreign-accented text-to-speech synthesis, neural text-to-speech synthesis, latent prosodic features
HSV category
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-349946
DOI: 10.21437/SpeechProsody.2024-209
Scopus ID: 2-s2.0-105008058763
OAI: oai:DiVA.org:kth-349946
DiVA, id: diva2:1881737
Conference
Speech Prosody 2024, Leiden, The Netherlands, 2-5 July 2024
Projects
Deep learning based speech synthesis for reading aloud of lengthy and information rich texts in Swedish (2018-02427)
Språkbanken Tal (2017-00626)
Funder
Vinnova (2018-02427)
Note

QC 20240705

Available from: 2024-07-03 Created: 2024-07-03 Last updated: 2025-07-01 Bibliographically approved

Open Access in DiVA

fulltext (511 kB), 244 downloads
File information
File: FULLTEXT01.pdf
File size: 511 kB
Checksum (SHA-512): ae43bef131ad676c45e4124abfa4ad2e6ec674173781b55331238af708e03d70e27412abac92d0f0673d69d1739156524196cbcf54e6b54d8394478034c85438
Type: fulltext
MIME type: application/pdf

Authors

Tånnander, Christina; O'Regan, Jim; House, David; Edlund, Jens; Beskow, Jonas

Total: 245 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.