KTH Publications (kth.se)
Publications (10 of 13)
Skantze, G. & Irfan, B. (2025). Applying General Turn-Taking Models to Conversational Human-Robot Interaction. In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. Paper presented at 20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025 (pp. 859-868). Institute of Electrical and Electronics Engineers (IEEE)
Applying General Turn-Taking Models to Conversational Human-Robot Interaction
2025 (English). In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 859-868. Conference paper, Published paper (Refereed)
Abstract [en]

Turn-taking is a fundamental aspect of conversation, but current Human-Robot Interaction (HRI) systems often rely on simplistic, silence-based models, leading to unnatural pauses and interruptions. This paper investigates, for the first time, the application of general turn-taking models, specifically TurnGPT and Voice Activity Projection (VAP), to improve conversational dynamics in HRI. These models are trained on human-human dialogue data using self-supervised learning objectives, without requiring domain-specific fine-tuning. We propose methods for using these models in tandem to predict when a robot should begin preparing responses, take turns, and handle potential interruptions. We evaluated the proposed system in a within-subject study against a traditional baseline system, using the Furhat robot with 39 adults in a conversational setting, in combination with a large language model for autonomous response generation. The results show that participants significantly prefer the proposed system, and it significantly reduces response delays and interruptions.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
conversational AI, human-robot interaction, large language model, turn-taking
National Category
Natural Language Processing; Computer Sciences; Human Computer Interaction
Identifiers
urn:nbn:se:kth:diva-363767 (URN); 10.1109/HRI61500.2025.10973958 (DOI); 2-s2.0-105004876033 (Scopus ID)
Conference
20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025
Note

Part of ISBN 9798350378931

QC 20250527

Available from: 2025-05-21. Created: 2025-05-21. Last updated: 2025-05-27. Bibliographically approved.
Irfan, B., Kuoppamäki, S., Hosseini, A. & Skantze, G. (2025). Between reality and delusion: challenges of applying large language models to companion robots for open-domain dialogues with older adults. Autonomous Robots, 49(1), Article ID 9.
Between reality and delusion: challenges of applying large language models to companion robots for open-domain dialogues with older adults
2025 (English). In: Autonomous Robots, ISSN 0929-5593, E-ISSN 1573-7527, Vol. 49, no. 1, article id 9. Article in journal (Refereed), Published
Abstract [en]

Throughout our lives, we interact daily in conversations with our friends and family, covering a wide range of topics, known as open-domain dialogue. As we age, these interactions may diminish due to changes in social and personal relationships, leading to loneliness in older adults. Conversational companion robots can alleviate this issue by providing daily social support. Large language models (LLMs) offer flexibility for enabling open-domain dialogue in these robots. However, LLMs are typically trained and evaluated on textual data, while robots introduce additional complexity through multi-modal interactions, which has not been explored in prior studies. Moreover, it is crucial to involve older adults in the development of robots to ensure alignment with their needs and expectations. Correspondingly, using iterative participatory design approaches, this paper exposes the challenges of integrating LLMs into conversational robots, derived from 34 Swedish-speaking older adults' (one-to-one) interactions with a personalized companion robot, built on the Furhat robot with GPT-3.5. These challenges encompass disruptions in conversations, including frequent interruptions; slow, repetitive, superficial, incoherent, and disengaging responses; language barriers; hallucinations; and outdated information, leading to frustration, confusion, and worry among older adults. Drawing on insights from these challenges, we offer recommendations to enhance the integration of LLMs into conversational robots, encompassing both general suggestions and those tailored to companion robots for older adults.

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
Large language models, Companion robot, Elderly care, Open-domain dialogue, Socially assistive robot, Participatory design
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-361621 (URN); 10.1007/s10514-025-10190-y (DOI); 001440005600001 (); 2-s2.0-86000731912 (Scopus ID)
Note

QC 20250324

Available from: 2025-03-24. Created: 2025-03-24. Last updated: 2025-03-24. Bibliographically approved.
Irfan, B. & Skantze, G. (2025). Between You and Me: Ethics of Self-Disclosure in Human-Robot Interaction. In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. Paper presented at 20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025 (pp. 1357-1362). Institute of Electrical and Electronics Engineers (IEEE)
Between You and Me: Ethics of Self-Disclosure in Human-Robot Interaction
2025 (English). In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 1357-1362. Conference paper, Published paper (Refereed)
Abstract [en]

As we move toward a future where robots are increasingly part of daily life, the privacy risks associated with interactions, particularly those relying on cloud-based large language models (LLMs), are becoming more pressing. Users may unknowingly share sensitive information in environments such as homes or hospitals. To explore these risks, we conducted a study with 39 native English speakers using a Furhat robot with an integrated LLM. Participants discussed two moral dilemmas: (i) dishonesty, sharing personal stories of justified lying, and (ii) robot disobedience, discussing whether robots should disobey commands. On average, participants disclosed personal stories 45% of the time when asked in both scenarios. The main reason for non-disclosure was difficulty recalling examples quickly (33.3-56%), rather than reluctance to share (7.2-16%). However, most participants reported little discomfort or concern about sharing personal information with the robot, indicating limited awareness of the privacy risks involved in such disclosures.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
ethics, human-robot interaction, large language model, moral dilemmas, privacy, self-disclosure
National Category
Human Computer Interaction; Robotics and automation; Ethics
Identifiers
urn:nbn:se:kth:diva-363756 (URN); 10.1109/HRI61500.2025.10974215 (DOI); 2-s2.0-105004876468 (Scopus ID)
Conference
20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025
Note

Part of ISBN 9798350378931

QC 20250525

Available from: 2025-05-21. Created: 2025-05-21. Last updated: 2025-05-25. Bibliographically approved.
Irfan, B., Miniota, J., Thunberg, S., Lagerstedt, E., Kuoppamäki, S., Skantze, G. & Abelho Pereira, A. T. (2025). Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES). IEEE Transactions on Affective Computing
Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES)
2025 (English). In: IEEE Transactions on Affective Computing, E-ISSN 1949-3045. Article in journal (Refereed), Epub ahead of print
Abstract [en]

Understanding user enjoyment is crucial in human-robot interaction (HRI), as it can impact interaction quality and influence user acceptance and long-term engagement with robots, particularly in the context of conversations with social robots. However, current assessment methods rely solely on self-reported questionnaires, failing to capture interaction dynamics. This work introduces the Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES), a novel 5-point scale to assess user enjoyment from an external perspective (e.g. by an annotator) for conversations with a robot. The scale was developed through rigorous evaluations and discussions among three annotators with relevant expertise, using open-domain conversations with a companion robot that was powered by a large language model, and was applied to each conversation exchange (i.e. a robot-participant turn pair) alongside overall interaction. It was evaluated on 25 older adults' interactions with the companion robot, corresponding to 174 minutes of data, showing moderate to good alignment between annotators. Although the scale was developed and tested in the context of older adult interactions with a robot, its basis in general and non-task-specific indicators of enjoyment supports its broader applicability. The study further offers insights into understanding the nuances and challenges of assessing user enjoyment in robot interactions, and provides guidelines on applying the scale to other domains and populations. The dataset is available online.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-374884 (URN); 10.1109/TAFFC.2025.3590359 (DOI); 2-s2.0-105011494748 (Scopus ID)
Note

QC 20260107

Available from: 2026-01-06. Created: 2026-01-06. Last updated: 2026-01-07. Bibliographically approved.
Irfan, B., Churamani, N., Zhao, M., Ayub, A. & Rossi, S. (2025). Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI): Overcoming Inequalities with Adaptation. In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. Paper presented at 20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025 (pp. 1970-1972). Institute of Electrical and Electronics Engineers (IEEE)
Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI): Overcoming Inequalities with Adaptation
2025 (English). In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 1970-1972. Conference paper, Published paper (Refereed)
Abstract [en]

Global inequalities in access to essential resources such as education, healthcare, and technology continue to widen social and economic disparities, especially in underserved and underrepresented communities. The growing integration of foundation models and other machine learning systems in robots offers promising and personalized solutions that can adapt to various individuals, situations, and environments, potentially addressing some of these gaps. By learning from interactions and evolving with local conditions, these systems can provide individualized support, such as assisting older adults with daily tasks, aiding children with special needs in learning environments, or empowering people with disabilities to live more independently. Building trust and fostering collaboration between humans and robots will help ensure that these systems meet the unique needs of all individuals, especially within long-term human-robot interaction (HRI). With this year's theme of 'Overcoming Inequalities with Adaptation', in line with the overall theme of the conference 'Robots for a Sustainable World', the fifth edition of the 'Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI)' workshop aims to bring together insights across diverse disciplines, exploring how continually evolving robots can effectively operate in diverse environments, promoting greater equity, inclusivity, and empowerment for individuals and communities. The workshop aims to facilitate collaborations across diverse scientific perspectives through a keynote presentation, panel discussions, and in-depth discussions on the contributed talks, attempting to shape a more sustainable and equitable future through adaptive advancements in long-term HRI.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
adaptation, continual learning, Human-Robot Interaction (HRI), inequalities, lifelong learning, long-term HRI, personalization, user modeling, workshop
National Category
Human Computer Interaction; Robotics and automation
Identifiers
urn:nbn:se:kth:diva-363762 (URN); 10.1109/HRI61500.2025.10973812 (DOI); 2-s2.0-105004879520 (Scopus ID)
Conference
20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025
Note

Part of ISBN 9798350378931

QC 20250525

Available from: 2025-05-21. Created: 2025-05-21. Last updated: 2025-05-25. Bibliographically approved.
Janssens, R., Pereira, A., Skantze, G., Irfan, B. & Belpaeme, T. (2025). Online Prediction of User Enjoyment in Human-Robot Dialogue with LLMs. In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. Paper presented at 20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025 (pp. 1363-1367). Institute of Electrical and Electronics Engineers (IEEE)
Online Prediction of User Enjoyment in Human-Robot Dialogue with LLMs
2025 (English). In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 1363-1367. Conference paper, Published paper (Refereed)
Abstract [en]

Large Language Models (LLMs) allow social robots to engage in unconstrained open-domain dialogue, but often make mistakes when employed in real-world interactions, requiring adaptation of LLMs to specific conversational contexts. However, LLM adaptation techniques require a feedback signal, ideally for multiple alternative utterances. At the same time, human-robot dialogue data is scarce and research often relies on external annotators. A tool for automatic prediction of user enjoyment in human-robot dialogue is therefore needed. We investigate the possibility of predicting user enjoyment turn-by-turn using an LLM, giving it a proposed robot utterance within the dialogue context, but without access to user response. We compare this performance to the system's enjoyment ratings when user responses are available and to assessments by expert human annotators, in addition to self-reported user perceptions. We evaluate the proposed LLM predictor in a human-robot interaction (HRI) dataset with conversation transcripts of 25 older adults' 7-minute dialogues with a companion robot. Our results show that an LLM is capable of predicting user enjoyment, without loss of performance despite the lack of user response and even achieving performance similar to that of human expert annotators. Furthermore, results show that the system surpasses expert annotators in its correlation with the user's self-reported perceptions of the conversation. This work presents a tool to remove the reliance on external annotators for enjoyment evaluation and paves the way toward real-time adaptation in human-robot dialogue.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
human-robot interaction, large language model, open-domain dialogue, prediction, user enjoyment
National Category
Natural Language Processing; Computer Sciences; Robotics and automation; Human Computer Interaction
Identifiers
urn:nbn:se:kth:diva-363754 (URN); 10.1109/HRI61500.2025.10973944 (DOI); 2-s2.0-105004873166 (Scopus ID)
Conference
20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025
Note

Part of ISBN 9798350378931

QC 20250525

Available from: 2025-05-21. Created: 2025-05-21. Last updated: 2025-05-25. Bibliographically approved.
Marcinek, L., Irfan, B., Skantze, G., Abelho Pereira, A. T. & Gustafsson, J. (2025). Role of Reasoning in LLM Enjoyment Detection: Evaluation Across Conversational Levels for Human-Robot Interaction. In: Frédéric Béchet, Fabrice Lefèvre, Nicholas Asher, Seokhwan Kim, Teva Merlin (Ed.), Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Paper presented at The 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Avignon, France, August 25-27, 2025 (pp. 573-590). SIGDIAL
Role of Reasoning in LLM Enjoyment Detection: Evaluation Across Conversational Levels for Human-Robot Interaction
2025 (English). In: Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue / [ed] Frédéric Béchet, Fabrice Lefèvre, Nicholas Asher, Seokhwan Kim, Teva Merlin, SIGDIAL, 2025, p. 573-590. Conference paper, Published paper (Refereed)
Abstract [en]

User enjoyment is central to developing conversational AI systems that can recover from failures and maintain interest over time. However, existing approaches often struggle to detect subtle cues that reflect user experience. Large Language Models (LLMs) with reasoning capabilities have outperformed standard models on various other tasks, suggesting potential benefits for enjoyment detection. This study investigates whether models with reasoning capabilities outperform standard models when assessing enjoyment in a human-robot dialogue corpus at both turn and interaction levels. Results indicate that reasoning capabilities have complex, model-dependent effects rather than universal benefits. While performance was nearly identical at the interaction level (0.44 vs 0.43), reasoning models substantially outperformed at the turn level (0.42 vs 0.36). Notably, LLMs correlated better with users’ self-reported enjoyment metrics than human annotators, despite achieving lower accuracy against human consensus ratings. Analysis revealed distinctive error patterns: non-reasoning models showed bias toward positive ratings at the turn level, while both model types exhibited central tendency bias at the interaction level. These findings suggest that reasoning should be applied selectively based on model architecture and assessment context, with assessment granularity significantly influencing relative effectiveness.

Place, publisher, year, edition, pages
SIGDIAL, 2025
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-374881 (URN)
Conference
The 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Avignon, France, August 25-27, 2025
Note

Part of ISBN 979-8-89176-329-6

QC 20260107

Available from: 2026-01-06. Created: 2026-01-06. Last updated: 2026-01-07. Bibliographically approved.
Santana, R., Irfan, B., Lagerstedt, E., Skantze, G. & Abelho Pereira, A. T. (2025). Speech-to-Joy: Self-Supervised Features for Enjoyment Prediction in Human-Robot Conversation. In: Ram Subramanian, Yukiko I. Nakano, Tom Gedeon, Mohan Kankanhalli, Tanaya Guha, Jainendra Shukla, Gelareh Mohammadi, Oya Celiktutan (Ed.), Proceedings of the 27th International Conference on Multimodal Interaction, ICMI 2025. Paper presented at The 27th International Conference on Multimodal Interaction, ICMI 2025, Canberra, Australia, October 13-17, 2025 (pp. 238-248). Association for Computing Machinery (ACM)
Speech-to-Joy: Self-Supervised Features for Enjoyment Prediction in Human-Robot Conversation
2025 (English). In: Proceedings of the 27th International Conference on Multimodal Interaction, ICMI 2025 / [ed] Ram Subramanian, Yukiko I. Nakano, Tom Gedeon, Mohan Kankanhalli, Tanaya Guha, Jainendra Shukla, Gelareh Mohammadi, Oya Celiktutan, Association for Computing Machinery (ACM), 2025, p. 238-248. Conference paper, Published paper (Refereed)
Abstract [en]

Conversational systems that interact or collaborate with people must understand not only task success but also the quality of human experience. We present Speech-to-Joy, a lightweight framework that learns to predict users’ own post-interaction enjoyment ratings using latent embeddings from audio and text modalities. Evaluated on a corpus of human-robot dialogues, the model’s predicted enjoyment correlates strongly and significantly with user self-reports, outperforming both an experienced HRI annotator and heavier LLM-based uni- and multimodal baselines. Notably, even the unimodal audio branch - using only frozen speech embeddings - surpasses all baselines, and a late-fusion of text and audio achieves the highest performance. Designed for real-time inference on resource-limited platforms, Speech-to-Joy replaces ad-hoc emotion heuristics with a direct and user-centered measure of enjoyment. This work paves the way for optimizing interactions with robots and other conversational systems through the lens that matters most: the user’s own experience.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2025
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-374889 (URN); 10.1145/3716553.3750747 (DOI); 2-s2.0-105022238812 (Scopus ID)
Conference
The 27th International Conference on Multimodal Interaction, ICMI 2025, Canberra, Australia, October 13-17, 2025
Note

Part of ISBN 979-8-4007-1499-3

QC 20260107

Available from: 2026-01-06. Created: 2026-01-06. Last updated: 2026-01-07. Bibliographically approved.
Irfan, B., Staffa, M., Bobu, A. & Churamani, N. (2024). Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI): Open-World Learning. In: HRI 2024 Companion - Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction. Paper presented at 19th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2024, Boulder, United States of America, March 11-15, 2024 (pp. 1323-1325). Association for Computing Machinery (ACM)
Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI): Open-World Learning
2024 (English). In: HRI 2024 Companion - Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, Association for Computing Machinery (ACM), 2024, p. 1323-1325. Conference paper, Published paper (Refereed)
Abstract [en]

The complex and largely unstructured nature of real-world situations makes it challenging for conventional closed-world robot learning solutions to adapt to such interaction dynamics. These challenges become particularly pronounced in long-term interactions where robots need to go beyond their past learning to continuously evolve with changing environment settings and personalize towards individual user behaviors. In contrast, open-world learning embraces the complexity and unpredictability of the real world, enabling robots to be "lifelong learners" that continuously acquire new knowledge and navigate novel challenges, making them more context-aware while intuitively engaging the users. Adopting the theme of "open-world learning", the fourth edition of the "Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI)" workshop seeks to bring together interdisciplinary perspectives on real-world applications in human-robot interaction (HRI), including education, rehabilitation, elderly care, service, and companionship. The goal of the workshop is to foster collaboration and understanding across diverse scientific communities through invited keynote presentations and in-depth discussions facilitated by contributed talks, a break-out session, and a debate.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Series
ACM/IEEE International Conference on Human-Robot Interaction, ISSN 2167-2148
Keywords
Adaptation, Continual Learning, Human-Robot Interaction, Lifelong Learning, Open-World Learning, Personalization, Workshop
National Category
Human Computer Interaction
Identifiers
urn:nbn:se:kth:diva-344806 (URN); 10.1145/3610978.3638159 (DOI); 001255070800287 (); 2-s2.0-85188103162 (Scopus ID)
Conference
19th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2024, Boulder, United States of America, March 11-15, 2024
Note

QC 20240402

Part of ISBN 9798400703232

Available from: 2024-03-28. Created: 2024-03-28. Last updated: 2024-09-03. Bibliographically approved.
Abelho Pereira, A. T., Marcinek, L., Miniotaitė, J., Thunberg, S., Lagerstedt, E., Gustafsson, J., . . . Irfan, B. (2024). Multimodal User Enjoyment Detection in Human-Robot Conversation: The Power of Large Language Models. Paper presented at 26th International Conference on Multimodal Interaction (ICMI), San Jose, USA, November 4-8, 2024 (pp. 469-478). Association for Computing Machinery (ACM)
Multimodal User Enjoyment Detection in Human-Robot Conversation: The Power of Large Language Models
2024 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Enjoyment is a crucial yet complex indicator of positive user experience in Human-Robot Interaction (HRI). While manual enjoyment annotation is feasible, developing reliable automatic detection methods remains a challenge. This paper investigates a multimodal approach to automatic enjoyment annotation for HRI conversations, leveraging large language models (LLMs), visual, audio, and temporal cues. Our findings demonstrate that both text-only and multimodal LLMs with carefully designed prompts can achieve performance comparable to human annotators in detecting user enjoyment. Furthermore, results reveal a stronger alignment between LLM-based annotations and user self-reports of enjoyment compared to human annotators. While multimodal supervised learning techniques did not improve all of our performance metrics, they could successfully replicate human annotators and highlighted the importance of visual and audio cues in detecting subtle shifts in enjoyment. This research demonstrates the potential of LLMs for real-time enjoyment detection, paving the way for adaptive companion robots that can dynamically enhance user experiences.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
Affect Recognition, Human-Robot Interaction, Large Language Models, Multimodal, Older Adults, User Enjoyment
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-359146 (URN); 10.1145/3678957.3685729 (DOI); 001433669800051 (); 2-s2.0-85212589337 (Scopus ID)
Conference
26th International Conference on Multimodal Interaction (ICMI), San Jose, USA, November 4-8, 2024
Note

QC 20250127

Available from: 2025-01-27. Created: 2025-01-27. Last updated: 2025-04-30. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-7983-079X
