Advancing User-Voice Interaction: Exploring Emotion-Aware Voice Assistants Through a Role-Swapping Approach
Univ Bergen, Bergen, Norway.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-1804-6296
Univ Surrey, Guildford, Surrey, England.
Norwegian Univ Sci & Technol, Trondheim, Norway.
2025 (English). In: Distributed, Ambient and Pervasive Interactions, DAPI 2025, Part I / [ed] Konomi, S.; Streitz, N. A., Springer Nature, 2025, Vol. 15802, p. 303-320. Conference paper, Published paper (Refereed).
Abstract [en]

As voice assistants (VAs) become increasingly integrated into daily life, the need for emotion-aware systems that can recognize and respond appropriately to user emotions has grown. While significant progress has been made in speech emotion recognition (SER) and sentiment analysis, effectively addressing user emotions, particularly negative ones, remains a challenge. This study explores human emotional response strategies in VA interactions using a role-swapping approach, in which participants regulate AI emotions rather than receiving pre-programmed responses. Through speech feature analysis and natural language processing (NLP), we examined acoustic and linguistic patterns across various emotional scenarios. Results show that participants favor neutral or positive emotional responses when engaging with negative emotional cues, highlighting a natural tendency toward emotional regulation and de-escalation. Key acoustic indicators such as root mean square (RMS) energy, zero-crossing rate (ZCR), and jitter were identified as sensitive to emotional states, while sentiment polarity and lexical diversity (type-token ratio, TTR) distinguished between positive and negative responses. These findings provide valuable insights for developing adaptive, context-aware VAs capable of delivering empathetic, culturally sensitive, and user-aligned responses. By understanding how humans naturally regulate emotions in AI interactions, this research contributes to the design of more intuitive and emotionally intelligent voice assistants, enhancing user trust and engagement in human-AI interactions.
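The acoustic and lexical measures named in the abstract are standard signal- and text-level features. As a minimal illustrative sketch (not the authors' actual pipeline; it assumes mono audio as a NumPy float array and whitespace-tokenized text), RMS energy, zero-crossing rate, and type-token ratio can be computed as:

```python
import numpy as np

def rms(frame: np.ndarray) -> float:
    # Root mean square: overall energy/loudness of the frame.
    return float(np.sqrt(np.mean(frame ** 2)))

def zero_crossing_rate(frame: np.ndarray) -> float:
    # Fraction of adjacent sample pairs whose sign differs;
    # correlates with noisiness and spectral brightness.
    signs = np.sign(frame)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return float(np.mean(signs[:-1] != signs[1:]))

def type_token_ratio(text: str) -> float:
    # Lexical diversity: unique word types / total word tokens.
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Example: one second of a 440 Hz sine tone sampled at 16 kHz.
t = np.linspace(0, 1, 16000, endpoint=False)
signal = 0.5 * np.sin(2 * np.pi * 440 * t)
print(rms(signal))                 # about 0.354, i.e. amplitude / sqrt(2)
print(zero_crossing_rate(signal))  # about 2 * 440 / 16000 = 0.055
print(type_token_ratio("I am fine I am okay"))
```

Jitter (cycle-to-cycle pitch-period variation) additionally requires pitch-period estimation and is typically taken from a speech-analysis toolkit rather than computed by hand.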

Place, publisher, year, edition, pages
Springer Nature, 2025. Vol. 15802, p. 303-320.
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords [en]
Emotion-Aware Voice Assistants, Role-Swapping Approach, Speech and Linguistic Analysis, Speech Emotion Recognition (SER)
National Category
Comparative Language Studies and Linguistics
Identifiers
URN: urn:nbn:se:kth:diva-374167
DOI: 10.1007/978-3-031-92977-9_19
ISI: 001551861000019
Scopus ID: 2-s2.0-105007671581
OAI: oai:DiVA.org:kth-374167
DiVA, id: diva2:2022018
Conference
13th International Conference on Distributed, Ambient and Pervasive Interactions (DAPI), June 22-27, 2025, Gothenburg, Sweden
Note

Part of ISBN 978-3-031-92976-2; 978-3-031-92977-9

QC 20251216

Available from: 2025-12-16. Created: 2025-12-16. Last updated: 2025-12-16. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Zhang, Yuchong; Kragic Jensfelt, Danica
