KTH Publications (kth.se)
Publications (8 of 8)
Irfan, B., Miniota, J., Thunberg, S., Lagerstedt, E., Kuoppamäki, S., Skantze, G. & Abelho Pereira, A. T. (2025). Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES). IEEE Transactions on Affective Computing
2025 (English). In: IEEE Transactions on Affective Computing, E-ISSN 1949-3045. Journal article (Refereed). Epub ahead of print
Abstract [en]

Understanding user enjoyment is crucial in human-robot interaction (HRI), as it can impact interaction quality and influence user acceptance and long-term engagement with robots, particularly in the context of conversations with social robots. However, current assessment methods rely solely on self-reported questionnaires, failing to capture interaction dynamics. This work introduces the Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES), a novel 5-point scale to assess user enjoyment from an external perspective (e.g. by an annotator) for conversations with a robot. The scale was developed through rigorous evaluations and discussions among three annotators with relevant expertise, using open-domain conversations with a companion robot that was powered by a large language model, and was applied to each conversation exchange (i.e. a robot-participant turn pair) alongside overall interaction. It was evaluated on 25 older adults' interactions with the companion robot, corresponding to 174 minutes of data, showing moderate to good alignment between annotators. Although the scale was developed and tested in the context of older adult interactions with a robot, its basis in general and non-task-specific indicators of enjoyment supports its broader applicability. The study further offers insights into understanding the nuances and challenges of assessing user enjoyment in robot interactions, and provides guidelines on applying the scale to other domains and populations. The dataset is available online.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
National subject category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-374884 (URN); 10.1109/TAFFC.2025.3590359 (DOI); 2-s2.0-105011494748 (Scopus ID)
Note

QC 20260107

Available from: 2026-01-06. Created: 2026-01-06. Last updated: 2026-01-07. Bibliographically approved
Torubarova, E., Miniotaitė, J. & Abelho Pereira, A. T. (2025). Users and Wizards in Conversations: How WoZ Interface Choices Define Human-Robot Interactions. In: Proceedings of Robotics: Science and Systems. Paper presented at Robotics: Science and Systems XXI.
2025 (English). In: Proceedings of Robotics: Science and Systems, 2025. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we investigated how the choice of a Wizard-of-Oz (WoZ) interface affects communication with a robot from both the user's and the wizard's perspective. In a conversational setting, we used three WoZ interfaces with varying levels of dialogue input and output restrictions: a) a restricted perception GUI that showed fixed-view video and ASR transcripts and let the wizard trigger pre-scripted utterances and gestures; b) an unrestricted perception GUI that added real-time audio from the participant and the robot; and c) a VR telepresence interface that streamed immersive stereo video and audio to the wizard and forwarded the wizard's spontaneous speech, gaze and facial expressions to the robot. We found that the interaction mediated by the VR interface was preferred by users in terms of robot features and perceived social presence. For the wizards, the VR condition turned out to be the most demanding but elicited a higher social connection with the users. The VR interface also induced the most connected interaction in terms of inter-speaker gaps and overlaps, while the Restricted GUI induced the least connected flow and the largest silences. Given these results, we argue for more WoZ studies using telepresence interfaces. These studies better reflect the robots of tomorrow and offer a promising path to automation based on naturalistic contextualized verbal and non-verbal behavioral data.

Keywords
VR, Wizard-of-Oz, teleoperation, social robotics
National subject category
Robotics and Automation; Human-Computer Interaction (Interaction Design)
Research subject
Human-Computer Interaction
Identifiers
urn:nbn:se:kth:diva-379050 (URN); 10.15607/RSS.2025.XXI.085 (DOI)
Conference
Robotics: Science and Systems XXI
Available from: 2026-04-07. Created: 2026-04-07. Last updated: 2026-04-07
Abelho Pereira, A. T., Marcinek, L., Miniotaitė, J., Thunberg, S., Lagerstedt, E., Gustafsson, J., . . . Irfan, B. (2024). Multimodal User Enjoyment Detection in Human-Robot Conversation: The Power of Large Language Models. Paper presented at 26th International Conference on Multimodal Interaction (ICMI), San Jose, USA, November 4-8, 2024 (pp. 469-478). Association for Computing Machinery (ACM)
2024 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Enjoyment is a crucial yet complex indicator of positive user experience in Human-Robot Interaction (HRI). While manual enjoyment annotation is feasible, developing reliable automatic detection methods remains a challenge. This paper investigates a multimodal approach to automatic enjoyment annotation for HRI conversations, leveraging large language models (LLMs), visual, audio, and temporal cues. Our findings demonstrate that both text-only and multimodal LLMs with carefully designed prompts can achieve performance comparable to human annotators in detecting user enjoyment. Furthermore, results reveal a stronger alignment between LLM-based annotations and user self-reports of enjoyment compared to human annotators. While multimodal supervised learning techniques did not improve all of our performance metrics, they could successfully replicate human annotators and highlighted the importance of visual and audio cues in detecting subtle shifts in enjoyment. This research demonstrates the potential of LLMs for real-time enjoyment detection, paving the way for adaptive companion robots that can dynamically enhance user experiences.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
Affect Recognition, Human-Robot Interaction, Large Language Models, Multimodal, Older Adults, User Enjoyment
National subject category
Natural Language Processing and Computational Linguistics
Identifiers
urn:nbn:se:kth:diva-359146 (URN); 10.1145/3678957.3685729 (DOI); 001433669800051 (); 2-s2.0-85212589337 (Scopus ID)
Conference
26th International Conference on Multimodal Interaction (ICMI), San Jose, USA, November 4-8, 2024
Note

QC 20250127

Available from: 2025-01-27. Created: 2025-01-27. Last updated: 2025-04-30. Bibliographically approved
Miniotaitė, J., Wang, S., Beskow, J., Gustafson, J., Székely, É. & Abelho Pereira, A. T. (2023). Hi robot, it's not what you say, it's how you say it. In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN. Paper presented at 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA (pp. 307-314). Institute of Electrical and Electronics Engineers (IEEE)
2023 (English). In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN, Institute of Electrical and Electronics Engineers (IEEE), 2023, pp. 307-314. Conference paper, Published paper (Refereed)
Abstract [en]

Many robots use their voice to communicate with people in spoken language but the voices commonly used for robots are often optimized for transactional interactions, rather than social ones. This can limit their ability to create engaging and natural interactions. To address this issue, we designed a spontaneous text-to-speech tool and used it to author natural and spontaneous robot speech. A crowdsourcing evaluation methodology is proposed to compare this type of speech to natural speech and state-of-the-art text-to-speech technology, both in disembodied and embodied form. We created speech samples in a naturalistic setting of people playing tabletop games and conducted a user study evaluating Naturalness, Intelligibility, Social Impression, Prosody, and Perceived Intelligence. The speech samples were chosen to represent three contexts that are common in tabletop games and the contexts were introduced to the participants that evaluated the speech samples. The study results show that the proposed evaluation methodology allowed for a robust analysis that successfully compared the different conditions. Moreover, the spontaneous voice met our target design goal of being perceived as more natural than a leading commercial text-to-speech.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE RO-MAN, ISSN 1944-9445
Keywords
speech synthesis, human-robot interaction, embodiment, spontaneous speech, intelligibility, naturalness
National subject category
Other Engineering and Technologies
Identifiers
urn:nbn:se:kth:diva-341972 (URN); 10.1109/RO-MAN57019.2023.10309427 (DOI); 001108678600044 (); 2-s2.0-85186982397 (Scopus ID)
Conference
32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA
Note

Part of proceedings ISBN 979-8-3503-3670-2

Available from: 2024-01-09. Created: 2024-01-09. Last updated: 2025-02-18. Bibliographically approved
Miniotaitė, J., Pakulytė, V. & Fernaeus, Y. (2022). Gentle Gestures of Control: On the Somatic Sensibilities of an IoT Remote App. Diseña (20), 1-16, Article ID 1.
2022 (English). In: Diseña, ISSN 2452-4298, no. 20, pp. 1-16, article id 1. Journal article (Refereed). Published
Abstract [en]

The design of user experiences for physical appliances increasingly involves connection, monitoring, and control via smartphone applications. Despite the rich possibilities for interaction provided by smartphones, the current standard mode of engagement with such apps is through graphical user interface manipulations. To explore new felt experiences for this use context, a remote-control app for a robotic vacuum cleaner was designed, enabling participants to have their gaze focused on the robot, while steering it by gently tilting the phone. This particular interaction is used as a case to emphasize the role of somatic sensibilities when designing smartphone applications in the context of IoT. Through a phenomenologically-inspired analysis, we describe the user experience in terms of physical manipulation, perception, effort, and utility, and through social and emotional engagement. An important attribute was how the interaction, through its subtleness, created a somatically connected experience.

Place, publisher, year, edition, pages
Pontificia Universidad Catolica de Chile, 2022
Keywords
domestic robots, soma design, embodied interaction, gestures, IoT experiences
National subject category
Other Engineering and Technologies; Human-Computer Interaction (Interaction Design)
Research subject
Human-Computer Interaction
Identifiers
urn:nbn:se:kth:diva-309528 (URN); 10.7764/disena.20.Article.1 (DOI); 2-s2.0-85150656981 (Scopus ID)
Funder
Stiftelsen för strategisk forskning (SSF), RIT15-0046
Note

QC 20251218

Available from: 2022-03-07. Created: 2022-03-07. Last updated: 2025-12-18. Bibliographically approved
Misgeld, O., Gulz, T., Holzapfel, A. & Miniotaitė, J. (2021). A case study of deep enculturation and sensorimotor synchronization to real music. In: Proceedings of the 22nd International Conference on Music Information Retrieval, ISMIR 2021, International Society for Music Information Retrieval. Paper presented at 22nd International Conference on Music Information Retrieval, ISMIR 2021, Virtual, Online, 7 November 2021 - 12 November 2021 (pp. 460-467).
2021 (English). In: Proceedings of the 22nd International Conference on Music Information Retrieval, ISMIR 2021, International Society for Music Information Retrieval, 2021, pp. 460-467. Conference paper, Published paper (Refereed)
Abstract [en]

Synchronization of movement to music is a behavioural capacity that separates humans from most other species. Whereas such movements have been studied using a wide range of methods, only few studies have investigated synchronisation to real music stimuli in a cross-culturally comparative setting. The present study employs beat tracking evaluation metrics and accent histograms to analyze the differences in the ways participants from two cultural groups synchronize their tapping with either familiar or unfamiliar music stimuli. Instead of choosing two apparently remote cultural groups, we selected two groups of musicians that share cultural backgrounds, but that differ regarding the music style they specialize in. The employed method to record tapping responses in audio format facilitates a fine-grained analysis of metrical accents that emerge from the responses. The identified differences between groups are related to the metrical structures inherent to the two musical styles, such as non-isochronicity of the beat, and differences between the groups document the influence of the deep enculturation of participants to their style of expertise. Besides these findings, our study sheds light on a conceptual weakness of a common beat tracking evaluation metric, when applied to human tapping instead of machine generated beat estimations.

National subject category
Computer and Information Sciences
Research subject
Media Technology
Identifiers
urn:nbn:se:kth:diva-301748 (URN); 2-s2.0-85184086384 (Scopus ID)
Conference
22nd International Conference on Music Information Retrieval, ISMIR 2021, Virtual, Online, 7 November 2021 - 12 November 2021
Funder
Vetenskapsrådet, 2019-03694; Marianne och Marcus Wallenbergs Stiftelse, 2020.0102
Note

Part of ISBN: 978-173272990-2

QC 20211027

Available from: 2021-09-10. Created: 2021-09-10. Last updated: 2025-02-18. Bibliographically approved
Misgeld, O., Gulz, T., Holzapfel, A. & Miniotaitė, J. (2021). A CASE STUDY OF DEEP ENCULTURATION AND SENSORIMOTOR SYNCHRONIZATION TO REAL MUSIC. In: Proceedings of the International Society for Music Information Retrieval Conference (pp. 460-467). International Society for Music Information Retrieval, 2021
2021 (English). In: Proceedings of the International Society for Music Information Retrieval Conference, International Society for Music Information Retrieval, 2021, Vol. 2021, pp. 460-467. Chapter in book, part of anthology (Other academic)
Abstract [en]

Synchronization of movement to music is a behavioural capacity that separates humans from most other species. Whereas such movements have been studied using a wide range of methods, only few studies have investigated synchronisation to real music stimuli in a cross-culturally comparative setting. The present study employs beat tracking evaluation metrics and accent histograms to analyze the differences in the ways participants from two cultural groups synchronize their tapping with either familiar or unfamiliar music stimuli. Instead of choosing two apparently remote cultural groups, we selected two groups of musicians that share cultural backgrounds, but that differ regarding the music style they specialize in. The employed method to record tapping responses in audio format facilitates a fine-grained analysis of metrical accents that emerge from the responses. The identified differences between groups are related to the metrical structures inherent to the two musical styles, such as non-isochronicity of the beat, and differences between the groups document the influence of the deep enculturation of participants to their style of expertise. Besides these findings, our study sheds light on a conceptual weakness of a common beat tracking evaluation metric, when applied to human tapping instead of machine generated beat estimations.

Place, publisher, year, edition, pages
International Society for Music Information Retrieval, 2021
National subject category
Other Engineering and Technologies
Identifiers
urn:nbn:se:kth:diva-361139 (URN); 2-s2.0-85219547453 (Scopus ID)
Note

QC 20250313

Available from: 2025-03-12. Created: 2025-03-12. Last updated: 2025-03-13. Bibliographically approved
Miniotaitė, J., Pakulytė, V. & Fernaeus, Y. (2021). JoyTilt: Between Autonomy and Control of a Robot Vacuum Cleaner. In: CEUR Workshop Proceedings. Paper presented at 2021 Workshops on Computer Human Interaction in IoT Applications, CHIIoT 2021, Eindhoven, Netherlands, 8 June 2021. CEUR-WS
2021 (English). In: CEUR Workshop Proceedings, CEUR-WS, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

Domestic IoT appliances like smart speakers, smart locks and robot vacuum cleaners are usually connected, monitored and controlled via smartphone apps. Despite the rich number of sensors and actuators available in smartphones, these apps primarily provide graphical user interfaces with these appliances. To explore a more somatically engaging experience the prototype JoyTilt was designed. It is a tilt-based remote control for robotic vacuum cleaners that was developed and tested with users. JoyTilt enabled participants to have their gaze focused on the robotic vacuum cleaner while controlling it. Interviews with the participants provide suggestions for balancing control of robot vacuum cleaners while keeping the robot's autonomy. In this study the somaesthetics, the interactive materials and choice of interaction model come together in the design to shape the human-robot relationship. Lastly, the study highlights the values of further considering the bodily experience when designing apps. 

Place, publisher, year, edition, pages
CEUR-WS, 2021
Series
CEUR Workshop Proceedings, ISSN 1613-0073
Keywords
Domestic robots, Embodied design, Gestures, Human-IoT experiences, Cleaning, Domestic appliances, Graphical user interfaces, Human robot interaction, Machine design, Remote control, Robotics, Smartphones, Balancing controls, Gesture, Human-IoT experience, Robot vacuum cleaners, Robotic vacuum cleaners, Sensors and actuators, Smart phones, Smartphone apps, Internet of things
National subject category
Robotics and Automation; Communication Systems
Identifiers
urn:nbn:se:kth:diva-316409 (URN); 2-s2.0-85122946147 (Scopus ID)
Conference
2021 Workshops on Computer Human Interaction in IoT Applications, CHIIoT 2021, Eindhoven, Netherlands, 8 June 2021
Note

QC 20220816

Available from: 2022-08-16. Created: 2022-08-16. Last updated: 2025-02-05. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0009-0006-2058-0112