Listener sensitivity to deviating obstruents in WaveNet
Sigmedia Lab, ADAPT Centre, School of Engineering, Trinity College Dublin, Ireland.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.ORCID iD: 0000-0001-9327-9482
Sigmedia Lab, ADAPT Centre, School of Engineering, Trinity College Dublin, Ireland.
Sigmedia Lab, ADAPT Centre, School of Engineering, Trinity College Dublin, Ireland.
2023 (English)
In: Interspeech 2023, International Speech Communication Association, 2023, p. 1080-1084
Conference paper, Published paper (Refereed)
Abstract [en]

This paper investigates the perceptual significance of the deviation in obstruents previously observed in WaveNet vocoders. The study involved presenting stimuli of varying lengths to 128 participants, who were asked to identify whether each stimulus was produced by a human or a machine. The participants' responses were captured using a 2-alternative forced choice task. The study found that while the length of the stimuli did not reliably affect participants' accuracy in the task, the concentration of obstruents did have a significant effect. Participants were consistently more accurate in identifying WaveNet stimuli as machine when the phrases were obstruent-rich. These findings show that the deviation in obstruents reported in WaveNet voices is perceivable by human listeners. The test protocol may be of wider utility in TTS.

Place, publisher, year, edition, pages
International Speech Communication Association, 2023. p. 1080-1084
Keywords [en]
distortion, obstruents, perception, TTS evaluation, WaveNet
National Category
Psychology (excluding Applied Psychology)
Identifiers
URN: urn:nbn:se:kth:diva-337831
DOI: 10.21437/Interspeech.2023-1843
ISI: 001186650301047
Scopus ID: 2-s2.0-85171585188
OAI: oai:DiVA.org:kth-337831
DiVA, id: diva2:1803491
Conference
24th International Speech Communication Association, Interspeech 2023, August 20-24, 2023, Dublin, Ireland
Note

QC 20241015

Available from: 2023-10-09 Created: 2023-10-09 Last updated: 2024-10-15 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Edlund, Jens
