Miniotaitė, Jūra
Publications (6 of 6)
Abelho Pereira, A. T., Marcinek, L., Miniotaitė, J., Thunberg, S., Lagerstedt, E., Gustafsson, J., . . . Irfan, B. (2024). Multimodal User Enjoyment Detection in Human-Robot Conversation: The Power of Large Language Models. Paper presented at 26th International Conference on Multimodal Interaction (ICMI), San Jose, USA, November 4-8, 2024 (pp. 469-478). Association for Computing Machinery (ACM)
2024 (English) Conference paper, Published paper (Refereed)
Abstract [en]

Enjoyment is a crucial yet complex indicator of positive user experience in Human-Robot Interaction (HRI). While manual enjoyment annotation is feasible, developing reliable automatic detection methods remains a challenge. This paper investigates a multimodal approach to automatic enjoyment annotation for HRI conversations, leveraging large language models (LLMs), visual, audio, and temporal cues. Our findings demonstrate that both text-only and multimodal LLMs with carefully designed prompts can achieve performance comparable to human annotators in detecting user enjoyment. Furthermore, results reveal a stronger alignment between LLM-based annotations and user self-reports of enjoyment compared to human annotators. While multimodal supervised learning techniques did not improve all of our performance metrics, they could successfully replicate human annotators and highlighted the importance of visual and audio cues in detecting subtle shifts in enjoyment. This research demonstrates the potential of LLMs for real-time enjoyment detection, paving the way for adaptive companion robots that can dynamically enhance user experiences.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
Affect Recognition, Human-Robot Interaction, Large Language Models, Multimodal, Older Adults, User Enjoyment
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-359146 (URN); 10.1145/3678957.3685729 (DOI); 001433669800051 (); 2-s2.0-85212589337 (Scopus ID)
Conference
26th International Conference on Multimodal Interaction (ICMI), San Jose, USA, November 4-8, 2024
Note

QC 20250127

Available from: 2025-01-27 Created: 2025-01-27 Last updated: 2025-04-30 Bibliographically approved
Miniotaitė, J., Wang, S., Beskow, J., Gustafson, J., Székely, É. & Abelho Pereira, A. T. (2023). Hi robot, it's not what you say, it's how you say it. In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN. Paper presented at 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA (pp. 307-314). Institute of Electrical and Electronics Engineers (IEEE)
2023 (English) In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 307-314. Conference paper, Published paper (Refereed)
Abstract [en]

Many robots use their voice to communicate with people in spoken language, but the voices commonly used for robots are often optimized for transactional interactions, rather than social ones. This can limit their ability to create engaging and natural interactions. To address this issue, we designed a spontaneous text-to-speech tool and used it to author natural and spontaneous robot speech. A crowdsourcing evaluation methodology is proposed to compare this type of speech to natural speech and state-of-the-art text-to-speech technology, both in disembodied and embodied form. We created speech samples in a naturalistic setting of people playing tabletop games and conducted a user study evaluating Naturalness, Intelligibility, Social Impression, Prosody, and Perceived Intelligence. The speech samples were chosen to represent three contexts that are common in tabletop games and the contexts were introduced to the participants that evaluated the speech samples. The study results show that the proposed evaluation methodology allowed for a robust analysis that successfully compared the different conditions. Moreover, the spontaneous voice met our target design goal of being perceived as more natural than a leading commercial text-to-speech.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE RO-MAN, ISSN 1944-9445
Keywords
speech synthesis, human-robot interaction, embodiment, spontaneous speech, intelligibility, naturalness
National Category
Other Engineering and Technologies
Identifiers
urn:nbn:se:kth:diva-341972 (URN); 10.1109/RO-MAN57019.2023.10309427 (DOI); 001108678600044 (); 2-s2.0-85186982397 (Scopus ID)
Conference
32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA
Note

Part of proceedings ISBN 979-8-3503-3670-2

Available from: 2024-01-09 Created: 2024-01-09 Last updated: 2025-02-18 Bibliographically approved
Miniotaitė, J., Pakulytė, V. & Fernaeus, Y. (2022). Gentle Gestures of Control: On the Somatic Sensibilities of an IoT Remote App. Diseña (20), 1-16, Article ID 1.
2022 (English) In: Diseña, ISSN 2452-4298, no 20, p. 1-16, article id 1. Article in journal (Refereed) Published
Abstract [en]

The design of user experiences for physical appliances increasingly involves connection, monitoring, and control via smartphone applications. Despite the rich possibilities for interaction provided by smartphones, the current standard mode of engagement with such apps is through graphical user interface manipulations. To explore new felt experiences for this use context, a remote-control app for a robotic vacuum cleaner was designed, enabling participants to have their gaze focused on the robot, while steering it by gently tilting the phone. This particular interaction is used as a case to emphasize the role of somatic sensibilities when designing smartphone applications in the context of IoT. Through a phenomenologically-inspired analysis, we describe the user experience in terms of physical manipulation, perception, effort, and utility, and through social and emotional engagement. An important attribute was how the interaction, through its subtleness, created a somatically connected experience.

Place, publisher, year, edition, pages
Pontificia Universidad Catolica de Chile, 2022
Keywords
domestic robots, soma design, embodied interaction, gestures, IoT experiences
National Category
Other Engineering and Technologies; Human Computer Interaction
Research subject
Human-computer Interaction
Identifiers
urn:nbn:se:kth:diva-309528 (URN); 10.7764/disena.20.Article.1 (DOI); 2-s2.0-85150656981 (Scopus ID)
Funder
Swedish Foundation for Strategic Research, RIT15-0046
Note

QC 20220309

Available from: 2022-03-07 Created: 2022-03-07 Last updated: 2025-02-18 Bibliographically approved
Misgeld, O., Gulz, T., Holzapfel, A. & Miniotaitė, J. (2021). A case study of deep enculturation and sensorimotor synchronization to real music. In: Proceedings of the 22nd International Conference on Music Information Retrieval, ISMIR 2021, International Society for Music Information Retrieval. Paper presented at 22nd International Conference on Music Information Retrieval, ISMIR 2021, Virtual, Online, 7 November 2021 - 12 November 2021 (pp. 460-467).
2021 (English) In: Proceedings of the 22nd International Conference on Music Information Retrieval, ISMIR 2021, International Society for Music Information Retrieval, 2021, p. 460-467. Conference paper, Published paper (Refereed)
Abstract [en]

Synchronization of movement to music is a behavioural capacity that separates humans from most other species. Whereas such movements have been studied using a wide range of methods, only few studies have investigated synchronisation to real music stimuli in a cross-culturally comparative setting. The present study employs beat tracking evaluation metrics and accent histograms to analyze the differences in the ways participants from two cultural groups synchronize their tapping with either familiar or unfamiliar music stimuli. Instead of choosing two apparently remote cultural groups, we selected two groups of musicians that share cultural backgrounds, but that differ regarding the music style they specialize in. The employed method to record tapping responses in audio format facilitates a fine-grained analysis of metrical accents that emerge from the responses. The identified differences between groups are related to the metrical structures inherent to the two musical styles, such as non-isochronicity of the beat, and differences between the groups document the influence of the deep enculturation of participants to their style of expertise. Besides these findings, our study sheds light on a conceptual weakness of a common beat tracking evaluation metric, when applied to human tapping instead of machine generated beat estimations.

National Category
Computer and Information Sciences
Research subject
Media Technology
Identifiers
urn:nbn:se:kth:diva-301748 (URN); 2-s2.0-85184086384 (Scopus ID)
Conference
22nd International Conference on Music Information Retrieval, ISMIR 2021, Virtual, Online, 7 November 2021 - 12 November 2021
Funder
Swedish Research Council, 2019-03694; Marianne and Marcus Wallenberg Foundation, 2020.0102
Note

Part of ISBN: 978-173272990-2

QC 20211027

Available from: 2021-09-10 Created: 2021-09-10 Last updated: 2025-02-18 Bibliographically approved
Misgeld, O., Gulz, T., Holzapfel, A. & Miniotaitė, J. (2021). A CASE STUDY OF DEEP ENCULTURATION AND SENSORIMOTOR SYNCHRONIZATION TO REAL MUSIC. In: Proceedings of the International Society for Music Information Retrieval Conference (pp. 460-467). International Society for Music Information Retrieval, 2021
2021 (English) In: Proceedings of the International Society for Music Information Retrieval Conference, International Society for Music Information Retrieval, 2021, Vol. 2021, p. 460-467. Chapter in book (Other academic)
Place, publisher, year, edition, pages
International Society for Music Information Retrieval, 2021
National Category
Other Engineering and Technologies
Identifiers
urn:nbn:se:kth:diva-361139 (URN); 2-s2.0-85219547453 (Scopus ID)
Note

QC 20250313

Available from: 2025-03-12 Created: 2025-03-12 Last updated: 2025-03-13 Bibliographically approved
Miniotaitė, J., Pakulytė, V. & Fernaeus, Y. (2021). JoyTilt: Between Autonomy and Control of a Robot Vacuum Cleaner. In: CEUR Workshop Proceedings. Paper presented at 2021 Workshops on Computer Human Interaction in IoT Applications, CHIIoT 2021, Eindhoven, Netherlands, 8 June 2021. CEUR-WS
2021 (English) In: CEUR Workshop Proceedings, CEUR-WS, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

Domestic IoT appliances like smart speakers, smart locks and robot vacuum cleaners are usually connected, monitored and controlled via smartphone apps. Despite the rich number of sensors and actuators available in smartphones, these apps primarily provide graphical user interfaces with these appliances. To explore a more somatically engaging experience the prototype JoyTilt was designed. It is a tilt-based remote control for robotic vacuum cleaners that was developed and tested with users. JoyTilt enabled participants to have their gaze focused on the robotic vacuum cleaner while controlling it. Interviews with the participants provide suggestions for balancing control of robot vacuum cleaners while keeping the robot's autonomy. In this study the somaesthetics, the interactive materials and choice of interaction model come together in the design to shape the human-robot relationship. Lastly, the study highlights the values of further considering the bodily experience when designing apps. 

Place, publisher, year, edition, pages
CEUR-WS, 2021
Series
CEUR Workshop Proceedings, ISSN 1613-0073
Keywords
Domestic robots, Embodied design, Gestures, Human-IoT experiences, Cleaning, Domestic appliances, Graphical user interfaces, Human robot interaction, Machine design, Remote control, Robotics, Smartphones, Balancing controls, Gesture, Human-IoT experience, Robot vacuum cleaners, Robotic vacuum cleaners, Sensors and actuators, Smart phones, Smartphone apps, Internet of things
National Category
Robotics and automation; Communication Systems
Identifiers
urn:nbn:se:kth:diva-316409 (URN); 2-s2.0-85122946147 (Scopus ID)
Conference
2021 Workshops on Computer Human Interaction in IoT Applications, CHIIoT 2021, Eindhoven, Netherlands, 8 June 2021
Note

QC 20220816

Available from: 2022-08-16 Created: 2022-08-16 Last updated: 2025-02-05 Bibliographically approved