Listener sensitivity to deviating obstruents in WaveNet
2023 (English). In: Interspeech 2023, International Speech Communication Association, 2023, p. 1080-1084. Conference paper, Published paper (Refereed)
Abstract [en]
This paper investigates the perceptual significance of the deviation in obstruents previously observed in WaveNet vocoders. The study involved presenting stimuli of varying lengths to 128 participants, who were asked to identify whether each stimulus was produced by a human or a machine. The participants' responses were captured using a 2-alternative forced choice task. The study found that while the length of the stimuli did not reliably affect participants' accuracy in the task, the concentration of obstruents did have a significant effect. Participants were consistently more accurate in identifying WaveNet stimuli as machine when the phrases were obstruent-rich. These findings show that the deviation in obstruents reported in WaveNet voices is perceivable by human listeners. The test protocol may be of wider utility in TTS.
Place, publisher, year, edition, pages
International Speech Communication Association, 2023. p. 1080-1084
Keywords [en]
distortion, obstruents, perception, TTS evaluation, WaveNet
National Category
Psychology (excluding Applied Psychology)
Identifiers
URN: urn:nbn:se:kth:diva-337831
DOI: 10.21437/Interspeech.2023-1843
ISI: 001186650301047
Scopus ID: 2-s2.0-85171585188
OAI: oai:DiVA.org:kth-337831
DiVA, id: diva2:1803491
Conference
24th International Speech Communication Association, Interspeech 2023, August 20-24, 2023, Dublin, Ireland
Note
QC 20241015
2023-10-09, 2023-10-09, 2024-10-15. Bibliographically approved