KTH Publications (kth.se)
Bokkahalli Satish, Shree Harsha (ORCID iD: orcid.org/0009-0000-0554-7265)
Publications (2 of 2)
Bokkahalli Satish, S. H., Henter, G. E. & Székely, É. (2026). When Voice Matters: Evidence of Gender Disparity in Positional Bias of SpeechLLMs. In: Speech and Computer - 27th International Conference, SPECOM 2025, Proceedings. Paper presented at the 27th International Conference on Speech and Computer, SPECOM 2025, Szeged, Hungary, October 13-15, 2025 (pp. 25-38). Springer Nature.
When Voice Matters: Evidence of Gender Disparity in Positional Bias of SpeechLLMs
2026 (English). In: Speech and Computer - 27th International Conference, SPECOM 2025, Proceedings, Springer Nature, 2026, pp. 25-38. Conference paper, Published paper (Refereed)
Abstract [en]

The rapid development of SpeechLLM-based conversational AI systems has created a need for robustly benchmarking these efforts, including aspects of fairness and bias. At present, such benchmarks typically rely on multiple choice question answering (MCQA). In this paper, we present the first token-level probabilistic evaluation and response-based study of several issues affecting the use of MCQA in SpeechLLM benchmarking: 1) we examine how model temperature and prompt design affect gender and positional bias on an MCQA gender-bias benchmark; 2) we examine how these biases are affected by the gender of the input voice; and 3) we study to what extent observed trends carry over to a second gender-bias benchmark. Our results show that concerns about positional bias from the text domain are equally valid in the speech domain. We also find the effect to be stronger for female voices than for male voices. To our knowledge, this is the first study to isolate positional bias effects in SpeechLLM-based gender-bias benchmarks. We conclude that current MCQA benchmarks do not account for speech-based bias, and alternative strategies are needed to ensure fairness towards all users.
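The token-level probabilistic MCQA evaluation mentioned in the abstract can be illustrated with a minimal sketch: score each answer option by its log-probability under the model, then compare the probability mass assigned to the same answer when the option order is permuted. All function names and numbers below are hypothetical illustrations, not taken from the paper.

```python
# Minimal sketch of measuring positional bias in MCQA, assuming per-option
# log-probabilities (e.g. summed token log-probs) are already available.
import math

def softmax(logits):
    """Convert per-option log-probabilities to a normalized distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def positional_bias(logits_order1, logits_order2, target_idx1, target_idx2):
    """Difference in probability assigned to the SAME answer when the option
    order is permuted; a nonzero value indicates positional bias."""
    p1 = softmax(logits_order1)[target_idx1]
    p2 = softmax(logits_order2)[target_idx2]
    return p1 - p2

# Illustrative numbers: the same answer scores -1.2 when listed first
# but only -2.0 when listed second; the model favors the first slot.
bias = positional_bias([-1.2, -2.5], [-2.5, -2.0], 0, 1)
```

Averaging such differences over a benchmark, separately for male and female input voices, would expose the kind of gender disparity in positional bias that the paper reports.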

Place, publisher, year, edition, pages
Springer Nature, 2026
Keywords
Benchmark robustness, Positional bias, SpeechLLMs
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-372782 (URN)
10.1007/978-3-032-07956-5_2 (DOI)
2-s2.0-105020237079 (Scopus ID)
Conference
27th International Conference on Speech and Computer, SPECOM 2025, Szeged, Hungary, October 13-15, 2025
Note

Part of ISBN 9783032079558

QC 20251120

Available from: 2025-11-20. Created: 2025-11-20. Last updated: 2025-11-20. Bibliographically approved.
Bokkahalli Satish, S. H., Henter, G. E. & Székely, É. (2025). Hear Me Out: Interactive evaluation and bias discovery platform for speech-to-speech conversational AI. In: Interspeech 2025. Paper presented at the 26th Interspeech Conference 2025, Rotterdam, Kingdom of the Netherlands, August 17-21, 2025 (pp. 2151-2152). International Speech Communication Association.
Hear Me Out: Interactive evaluation and bias discovery platform for speech-to-speech conversational AI
2025 (English). In: Interspeech 2025, International Speech Communication Association, 2025, pp. 2151-2152. Conference paper, Published paper (Refereed)
Abstract [en]

A new wave of speech foundation models is emerging, capable of processing spoken language directly from audio. These models promise more expressive and emotionally aware interactions by retaining prosodic information throughout conversations. 'Hear Me Out' evaluates their ability to preserve crucial vocal cues, enabling users to explore how variations in speaker characteristics and paralinguistic features influence AI responses. Through real-time voice conversion, users can ask a question and then re-ask it in a modified voice, immediately observing differences in response tone, phrasing, and behavior. The system presents paired responses side by side, offering direct comparisons of AI interpretations of the original and transformed voices, thereby highlighting potential biases. By inviting inquiry into speaker modeling, contextual understanding, and fairness, this immersive experience encourages users to reflect on identity and voice, and promotes inclusive future research.
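The side-by-side comparison the abstract describes can be sketched as a simple paired record: the same question posed in the original and a voice-converted version, with the two AI responses stored together for inspection. Field names and example strings below are assumptions for illustration, not the Hear Me Out platform's actual schema.

```python
# Hypothetical sketch of pairing responses to the original and
# voice-converted versions of the same spoken question.
from dataclasses import dataclass

@dataclass
class PairedResponse:
    question: str            # transcript of the spoken question
    original_voice: str      # label for the unmodified speaker
    converted_voice: str     # label for the voice-converted speaker
    response_original: str   # AI response to the original audio
    response_converted: str  # AI response to the converted audio

    def responses_differ(self) -> bool:
        """A differing response to identical wording flags a potential
        voice-dependent bias worth closer inspection."""
        return self.response_original.strip() != self.response_converted.strip()

pair = PairedResponse(
    question="What career would you recommend for me?",
    original_voice="original",
    converted_voice="converted",
    response_original="You might enjoy nursing or teaching.",
    response_converted="You might enjoy engineering or finance.",
)
```

Presenting such pairs side by side, as the platform does, lets users spot response differences attributable only to the change in voice.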

Place, publisher, year, edition, pages
International Speech Communication Association, 2025
Keywords
bias in conversational AI, speech-to-speech conversational AI, voice conversion
National Category
Natural Language Processing; Human Computer Interaction; Computer Sciences; Comparative Language Studies and Linguistics
Identifiers
urn:nbn:se:kth:diva-372786 (URN)
2-s2.0-105020052310 (Scopus ID)
Conference
26th Interspeech Conference 2025, Rotterdam, Kingdom of the Netherlands, August 17-21, 2025
Note

QC 20251120

Available from: 2025-11-20. Created: 2025-11-20. Last updated: 2025-11-20. Bibliographically approved.
