Abelho Pereira, André Tiago (ORCID iD: orcid.org/0000-0003-2428-0468)
Publications (10 of 24)
Arvidsson, C., Torubarova, E., Abelho Pereira, A. T. & Udden, J. (2024). Conversational production and comprehension: fMRI-evidence reminiscent of but deviant from the classical Broca-Wernicke model. Cerebral Cortex, 34(3), Article ID bhae073.
Conversational production and comprehension: fMRI-evidence reminiscent of but deviant from the classical Broca-Wernicke model
2024 (English). In: Cerebral Cortex, ISSN 1047-3211, E-ISSN 1460-2199, Vol. 34, no 3, article id bhae073. Article in journal (Refereed). Published
Abstract [en]

A key question in research on the neurobiology of language is to what extent the language production and comprehension systems share neural infrastructure, but this question has not been addressed in the context of conversation. We utilized a public fMRI dataset where 24 participants engaged in unscripted conversations with a confederate outside the scanner, via an audio-video link. We provide evidence indicating that the two systems share neural infrastructure in the left-lateralized perisylvian language network, but diverge regarding the level of activation in regions within the network. Activity in the left inferior frontal gyrus was stronger in production compared to comprehension, while comprehension showed stronger recruitment of the left anterior middle temporal gyrus and superior temporal sulcus, compared to production. Although our results are reminiscent of the classical Broca-Wernicke model, the anterior (rather than posterior) temporal activation is a notable difference from that model. This is one of the findings that may be a consequence of the conversational setting, another being that conversational production activated what we interpret as higher-level socio-pragmatic processes. In conclusion, we present evidence for partial overlap and functional asymmetry of the neural infrastructure of production and comprehension, in the above-mentioned frontal vs. temporal regions during conversation.

Place, publisher, year, edition, pages
Oxford University Press (OUP), 2024
Keywords
interaction, contextual language processing, LIFG, LMTG, functional asymmetry
National Category
Languages and Literature
Identifiers
urn:nbn:se:kth:diva-351444 (URN), 10.1093/cercor/bhae073 (DOI), 001273703700001, 38501383 (PubMedID), 2-s2.0-85188194135 (Scopus ID)
Abelho Pereira, A. T., Marcinek, L., Miniotaitė, J., Thunberg, S., Lagerstedt, E., Gustafsson, J., . . . Irfan, B. (2024). Multimodal User Enjoyment Detection in Human-Robot Conversation: The Power of Large Language Models. Paper presented at 26th International Conference on Multimodal Interaction (ICMI), San Jose, USA, November 4-8, 2024 (pp. 469-478). Association for Computing Machinery (ACM)
Multimodal User Enjoyment Detection in Human-Robot Conversation: The Power of Large Language Models
2024 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Enjoyment is a crucial yet complex indicator of positive user experience in Human-Robot Interaction (HRI). While manual enjoyment annotation is feasible, developing reliable automatic detection methods remains a challenge. This paper investigates a multimodal approach to automatic enjoyment annotation for HRI conversations, leveraging large language models (LLMs), visual, audio, and temporal cues. Our findings demonstrate that both text-only and multimodal LLMs with carefully designed prompts can achieve performance comparable to human annotators in detecting user enjoyment. Furthermore, results reveal a stronger alignment between LLM-based annotations and user self-reports of enjoyment compared to human annotators. While multimodal supervised learning techniques did not improve all of our performance metrics, they could successfully replicate human annotators and highlighted the importance of visual and audio cues in detecting subtle shifts in enjoyment. This research demonstrates the potential of LLMs for real-time enjoyment detection, paving the way for adaptive companion robots that can dynamically enhance user experiences.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
Affect Recognition, Human-Robot Interaction, Large Language Models, Multimodal, Older Adults, User Enjoyment
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-359146 (URN), 10.1145/3678957.3685729 (DOI), 001433669800051, 2-s2.0-85212589337 (Scopus ID)
Conference
26th International Conference on Multimodal Interaction (ICMI), San Jose, USA, November 4-8, 2024
Wozniak, M. K., Stower, R., Jensfelt, P. & Abelho Pereira, A. T. (2023). Happily Error After: Framework Development and User Study for Correcting Robot Perception Errors in Virtual Reality. In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN. Paper presented at 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA (pp. 1573-1580). Institute of Electrical and Electronics Engineers (IEEE)
Happily Error After: Framework Development and User Study for Correcting Robot Perception Errors in Virtual Reality
2023 (English). In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 1573-1580. Conference paper, Published paper (Refereed)
Abstract [en]

While robots are appearing in more areas of our lives, they still make errors. One common cause of failure stems from the robot perception module when detecting objects. Allowing users to correct such errors can help improve the interaction and prevent the same errors in the future. Consequently, we investigate the effectiveness of a virtual reality (VR) framework for correcting perception errors of a Franka Panda robot. We conducted a user study with 56 participants who interacted with the robot using both VR and screen interfaces. Participants learned to collaborate with the robot faster in the VR interface compared to the screen interface. Additionally, participants found the VR interface more immersive and enjoyable, and expressed a preference for using it again. These findings suggest that VR interfaces may offer advantages over screen interfaces for human-robot interaction in error-prone environments.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE RO-MAN, ISSN 1944-9445
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-341975 (URN), 10.1109/RO-MAN57019.2023.10309446 (DOI), 001108678600198, 2-s2.0-85186968933 (Scopus ID)
Conference
32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA
Note

Part of proceedings ISBN 979-8-3503-3670-2
Miniotaitė, J., Wang, S., Beskow, J., Gustafson, J., Székely, É. & Abelho Pereira, A. T. (2023). Hi robot, it's not what you say, it's how you say it. In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN. Paper presented at 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA (pp. 307-314). Institute of Electrical and Electronics Engineers (IEEE)
Hi robot, it's not what you say, it's how you say it
2023 (English). In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 307-314. Conference paper, Published paper (Refereed)
Abstract [en]

Many robots use their voice to communicate with people in spoken language, but the voices commonly used for robots are often optimized for transactional interactions rather than social ones. This can limit their ability to create engaging and natural interactions. To address this issue, we designed a spontaneous text-to-speech tool and used it to author natural and spontaneous robot speech. A crowdsourcing evaluation methodology is proposed to compare this type of speech to natural speech and state-of-the-art text-to-speech technology, both in disembodied and embodied form. We created speech samples in a naturalistic setting of people playing tabletop games and conducted a user study evaluating Naturalness, Intelligibility, Social Impression, Prosody, and Perceived Intelligence. The speech samples were chosen to represent three contexts that are common in tabletop games, and these contexts were introduced to the participants who evaluated the speech samples. The study results show that the proposed evaluation methodology allowed for a robust analysis that successfully compared the different conditions. Moreover, the spontaneous voice met our target design goal of being perceived as more natural than a leading commercial text-to-speech system.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE RO-MAN, ISSN 1944-9445
Keywords
speech synthesis, human-robot interaction, embodiment, spontaneous speech, intelligibility, naturalness
National Category
Other Engineering and Technologies
Identifiers
urn:nbn:se:kth:diva-341972 (URN), 10.1109/RO-MAN57019.2023.10309427 (DOI), 001108678600044, 2-s2.0-85186982397 (Scopus ID)
Conference
32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA
Note

Part of proceedings ISBN 979-8-3503-3670-2
Rato, D., Correia, F., Abelho Pereira, A. T. & Prada, R. (2023). Robots in Games. International Journal of Social Robotics, 15(1), 37-57
Robots in Games
2023 (English). In: International Journal of Social Robotics, ISSN 1875-4791, E-ISSN 1875-4805, Vol. 15, no 1, p. 37-57. Article, review/survey (Refereed). Published
Abstract [en]

During the past two decades, robots have been increasingly deployed in games. Researchers use games to better understand human-robot interaction and, in turn, the inclusion of social robots during gameplay creates new opportunities for novel game experiences. The contributions from social robotics and games communities cover a large spectrum of research questions using a wide variety of scenarios. In this article, we present the first comprehensive survey of the deployment of robots in games. We organise our findings according to four dimensions: (1) the societal impact of robots in games, (2) games as a research platform, (3) social interactions in games, and (4) game scenarios and materials. We discuss some significant research achievements and potential research avenues for the gaming and social robotics communities. This article describes the state of the art of the research on robots in games in the hope that it will assist researchers to contextualise their work in the field, to adhere to best practices and to identify future areas of research and multidisciplinary collaboration.

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Games, Human–robot interaction, Review, Social robotics
National Category
Human Computer Interaction; Robotics and automation
Identifiers
urn:nbn:se:kth:diva-329090 (URN), 10.1007/s12369-022-00944-4 (DOI), 000898635300001, 2-s2.0-85143900315 (Scopus ID)
Wozniak, M. K., Stower, R., Jensfelt, P. & Abelho Pereira, A. T. (2023). What You See Is (not) What You Get: A VR Framework For Correcting Robot Errors. In: HRI 2023: Companion of the ACM/IEEE International Conference on Human-Robot Interaction. Paper presented at 18th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2023, Stockholm, Sweden, Mar 13 2023 - Mar 16 2023 (pp. 243-247). Association for Computing Machinery (ACM)
What You See Is (not) What You Get: A VR Framework For Correcting Robot Errors
2023 (English). In: HRI 2023: Companion of the ACM/IEEE International Conference on Human-Robot Interaction, Association for Computing Machinery (ACM), 2023, p. 243-247. Conference paper, Published paper (Refereed)
Abstract [en]

Many solutions tailored for intuitive visualization or teleoperation of virtual, augmented and mixed (VAM) reality systems are not robust to robot failures, such as the inability to detect and recognize objects in the environment or planning unsafe trajectories. In this paper, we present a novel virtual reality (VR) framework where users can (i) recognize when the robot has failed to detect a real-world object, (ii) correct the error in VR, (iii) modify proposed object trajectories, and (iv) implement behaviors on a real-world robot. Finally, we propose a user study aimed at testing the efficacy of our framework. Project materials can be found in the OSF repository.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
Keywords
AR, human-robot interaction, perception, robotics, VR
National Category
Robotics and automation; Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-333372 (URN), 10.1145/3568294.3580081 (DOI), 001054975700044, 2-s2.0-85150432457 (Scopus ID)
Conference
18th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2023, Stockholm, Sweden, Mar 13 2023 - Mar 16 2023
Note

Part of ISBN 9781450399708
He, Y., Abelho Pereira, A. T. & Kucherenko, T. (2022). Evaluating data-driven co-speech gestures of embodied conversational agents through real-time interaction. In: IVA '22: Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents. Paper presented at IVA 2022 - Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents. Association for Computing Machinery (ACM)
Evaluating data-driven co-speech gestures of embodied conversational agents through real-time interaction
2022 (English). In: IVA '22: Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents, Association for Computing Machinery (ACM), 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Embodied Conversational Agents (ECAs) that make use of co-speech gestures can enhance human-machine interactions in many ways. In recent years, data-driven gesture generation approaches for ECAs have attracted considerable research attention, and related methods have continuously improved. Real-time interaction is typically used when researchers evaluate ECA systems that generate rule-based gestures. However, when evaluating the performance of ECAs based on data-driven methods, participants are often required only to watch pre-recorded videos, which cannot provide adequate information about what a person perceives during the interaction. To address this limitation, we explored the use of real-time interaction to assess data-driven gesturing ECAs. We provided a testbed framework, and investigated whether gestures could affect human perception of ECAs in the dimensions of human-likeness, animacy, perceived intelligence, and focused attention. Our user study required participants to interact with two ECAs - one with and one without hand gestures. We collected subjective data from the participants' self-report questionnaires and objective data from a gaze tracker. To our knowledge, the current study represents the first attempt to evaluate data-driven gesturing ECAs through real-time interaction and the first experiment using gaze tracking to examine the effect of ECAs' gestures.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
data-driven, embodied conversational agent, evaluation instrument, gaze tracking, gesture generation, user study, Interactive computer systems, Real time systems, Surveys, User interfaces, Agent systems, Data driven, Gaze-tracking, Human machine interaction, Real time interactions, Rule based, Eye tracking
National Category
Human Computer Interaction
Identifiers
urn:nbn:se:kth:diva-327286 (URN), 10.1145/3514197.3549697 (DOI), 2-s2.0-85138695641 (Scopus ID)
Conference
IVA 2022 - Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents
Laban, G., Le Maguer, S., Lee, M., Kontogiorgos, D., Reig, S., Torre, I., . . . Pereira, A. (2022). Robo-Identity: Exploring Artificial Identity and Emotion via Speech Interactions. In: PROCEEDINGS OF THE 2022 17TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '22). Paper presented at 17th Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI), MAR 07-10, 2022, ELECTR NETWORK (pp. 1265-1268). Institute of Electrical and Electronics Engineers (IEEE)
Robo-Identity: Exploring Artificial Identity and Emotion via Speech Interactions
2022 (English). In: PROCEEDINGS OF THE 2022 17TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '22), Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 1265-1268. Conference paper, Published paper (Refereed)
Abstract [en]

Following the success of the first edition of Robo-Identity, the second edition will provide an opportunity to expand the discussion about artificial identity. This year, we are focusing on emotions that are expressed through speech and voice. Synthetic voices of robots can resemble and are becoming indistinguishable from expressive human voices. This can be both an opportunity and a constraint in expressing emotional speech that can (falsely) convey a human-like identity, which can mislead people and raise ethical issues. How should we envision an agent's artificial identity? In what ways should we have robots that maintain a machine-like stance, e.g., through robotic speech, and should emotional expressions that are increasingly human-like be seen as design opportunities? These are not mutually exclusive concerns. As this discussion needs to be conducted in a multidisciplinary manner, we welcome perspectives on challenges and opportunities from a variety of fields. For this year's edition, the special theme will be "speech, emotion and artificial identity".

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Series
ACM IEEE International Conference on Human-Robot Interaction, ISSN 2167-2121
Keywords
artificial identity, voice, speech, emotion, affective computing, human-robot interaction, affective science
National Category
Human Computer Interaction; Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-322490 (URN), 10.1109/HRI53351.2022.9889649 (DOI), 000869793600219, 2-s2.0-85140715625 (Scopus ID)
Conference
17th Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI), MAR 07-10, 2022, ELECTR NETWORK
Note

Part of proceedings: ISBN 978-1-6654-0731-1
Nagy, R., Kucherenko, T., Moell, B., Abelho Pereira, A. T., Kjellström, H. & Bernardet, U. (2021). A Framework for Integrating Gesture Generation Models into Interactive Conversational Agents. Paper presented at 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
A Framework for Integrating Gesture Generation Models into Interactive Conversational Agents
2021 (English). Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

Embodied conversational agents (ECAs) benefit from non-verbal behavior for natural and efficient interaction with users. Gesticulation – hand and arm movements accompanying speech – is an essential part of non-verbal behavior. Gesture generation models have been developed for several decades: starting with rule-based and ending with mainly data-driven methods. To date, recent end-to-end gesture generation methods have not been evaluated in a real-time interaction with users. We present a proof-of-concept framework, which is intended to facilitate evaluation of modern gesture generation models in interaction. We demonstrate an extensible open-source framework that contains three components: 1) a 3D interactive agent; 2) a chatbot back-end; 3) a gesticulating system. Each component can be replaced, making the proposed framework applicable for investigating the effect of different gesturing models in real-time interactions with different communication modalities, chatbot backends, or different agent appearances. The code and video are available at the project page https://nagyrajmund.github.io/project/gesturebot.

Keywords
conversational embodied agents; non-verbal behavior synthesis
National Category
Human Computer Interaction
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-304616 (URN)
Conference
20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
Funder
Swedish Foundation for Strategic Research, RIT15-0107
Note

Not duplicate with DiVA 1653872
Nagy, R., Kucherenko, T., Moell, B., Abelho Pereira, A. T., Kjellström, H. & Bernardet, U. (2021). A framework for integrating gesture generation models into interactive conversational agents. In: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS. Paper presented at 20th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2021, 3 May 2021 through 7 May 2021 (pp. 1767-1769). International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
A framework for integrating gesture generation models into interactive conversational agents
2021 (English). In: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), 2021, p. 1767-1769. Conference paper, Published paper (Refereed)
Abstract [en]

Embodied conversational agents (ECAs) benefit from non-verbal behavior for natural and efficient interaction with users. Gesticulation - hand and arm movements accompanying speech - is an essential part of non-verbal behavior. Gesture generation models have been developed for several decades: starting with rule-based and ending with mainly data-driven methods. To date, recent end-to-end gesture generation methods have not been evaluated in a real-time interaction with users. We present a proof-of-concept framework, which is intended to facilitate evaluation of modern gesture generation models in interaction. We demonstrate an extensible open-source framework that contains three components: 1) a 3D interactive agent; 2) a chatbot backend; 3) a gesticulating system. Each component can be replaced, making the proposed framework applicable for investigating the effect of different gesturing models in real-time interactions with different communication modalities, chatbot backends, or different agent appearances. The code and video are available at the project page https://nagyrajmund.github.io/project/gesturebot.

Place, publisher, year, edition, pages
International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), 2021
Keywords
Conversational embodied agents, Non-verbal behavior synthesis, Multi agent systems, Open systems, Speech, Communication modalities, Conversational agents, Data-driven methods, Efficient interaction, Embodied conversational agent, Interactive agents, Open source frameworks, Real time interactions, Autonomous agents
National Category
Human Computer Interaction; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-311130 (URN), 2-s2.0-85112311041 (Scopus ID)
Conference
20th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2021, 3 May 2021 through 7 May 2021
Note

Part of proceedings: ISBN 978-1-7138-3262-1