kth.sePublications KTH
Operational message
There are currently operational disruptions. Troubleshooting is in progress.
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
VoiceQualityVC: A Voice Conversion System for Studying the Perceptual Effects of Voice Quality in Speech
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.ORCID iD: 0000-0001-9537-8505
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.ORCID iD: 0000-0002-0397-6442
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.ORCID iD: 0000-0003-1175-840X
2025 (English)In: Interspeech 2025, International Speech Communication Association , 2025, p. 2295-2299Conference paper, Published paper (Refereed)
Abstract [en]

Voice quality is an often overlooked aspect of speech with many communicative functions. Voice quality conveys both paralinguistic and pragmatic information, such as signalling speaker stance and aids in grounding. In this paper, we present VoiceQualityVC, a tool that can manipulate the voice quality of both natural and synthesized speech using voice quality features including CPPS, H1-H2, and H1-A3. VoiceQualityVC is a research tool for perceptual experiments into voice quality and UX experiments for voice design. We perform an objective evaluation demonstrating the control of these features as well as subjective listening tests of the paralinguistic attributes of intimacy, valence, and investment. In these listening tests breathy voice was rated as more intimate and more invested than modal voice and creaky voice was rated as less intimate and less positive. The code and models can be found at https://github.com/Hfkml/VQVC.

Place, publisher, year, edition, pages
International Speech Communication Association , 2025. p. 2295-2299
Keywords [en]
Paralinguistics, Pragmatics, Voice conversion, Voice quality
National Category
Comparative Language Studies and Linguistics Natural Language Processing
Identifiers
URN: urn:nbn:se:kth:diva-372784DOI: 10.21437/Interspeech.2025-902Scopus ID: 2-s2.0-105020036268OAI: oai:DiVA.org:kth-372784DiVA, id: diva2:2015306
Conference
26th Interspeech Conference 2025, Rotterdam, Netherlands, Kingdom of the, August 17-21, 2025
Note

QC 20251120

Available from: 2025-11-20 Created: 2025-11-20 Last updated: 2025-11-20Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full textScopus

Authority records

Lameris, HarmGustafsson, JoakimSzékely, Éva

Search in DiVA

By author/editor
Lameris, HarmGustafsson, JoakimSzékely, Éva
By organisation
Speech, Music and Hearing, TMH
Comparative Language Studies and LinguisticsNatural Language Processing

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 31 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf