Publications (10 of 26)
Jonell, P. (2022). Scalable Methods for Developing Interlocutor-aware Embodied Conversational Agents: Data Collection, Behavior Modeling, and Evaluation Methods. (Doctoral dissertation). KTH Royal Institute of Technology
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This work presents several methods, tools, and experiments that contribute to the development of interlocutor-aware Embodied Conversational Agents (ECAs). Interlocutor-aware ECAs take the interlocutor's behavior into consideration when generating their own non-verbal behaviors. This thesis targets the development of such adaptive ECAs by identifying and contributing to three important and related topics:

1) Data collection methods are presented, both for large-scale crowdsourced data collection and for in-lab data collection with a large number of sensors in a clinical setting. Experiments show that experts deemed dialog data collected using a crowdsourcing method to be better for dialog generation purposes than dialog data from other commonly used sources. 2) Methods for behavior modeling are presented, where machine learning models are used to generate facial gestures for ECAs. Methods for both single-speaker and interlocutor-aware generation are presented. 3) Evaluation methods are explored, and both third-party evaluation of generated gestures and interaction experiments with interlocutor-aware gesture generation are discussed. For example, an experiment is carried out investigating the social influence of a mimicking social robot. Furthermore, a method for more efficient perceptual experiments is presented. This method is validated by replicating a previously conducted perceptual experiment on virtual agents, and shows that the new method provides similar insights into the data (in fact, it provided more insights) while requiring less of the evaluators' time. A second study compared subjective evaluations of generated gestures performed in the lab with evaluations performed via crowdsourcing, and showed no difference between the two settings. A special focus in this thesis is placed on scalable methods, which make it possible to efficiently and rapidly collect interaction data from a broad range of people and to efficiently evaluate the results produced by the machine learning methods. This in turn allows for fast iteration when developing behaviors for interlocutor-aware ECAs.

Abstract [sv]

This work presents a number of methods, tools, and experiments that all contribute to the development of interlocutor-aware embodied conversational agents, i.e., agents that communicate through language, have a bodily representation (avatar or robot), and take the interlocutor's behavior into consideration when generating their own non-verbal behaviors. This thesis aims to contribute to the development of such agents by identifying and contributing to three important areas:

1) Data collection methods, both for large-scale data collection with the help of crowdworkers (a large number of people on the internet engaged to solve a task) and in a laboratory environment with a large number of sensors. Experiments are presented which show that, for example, dialog data collected with the help of crowdworkers was judged by a group of experts to be better for dialog generation purposes than other commonly used dialog-generation datasets. 2) Methods for behavior modeling, where machine learning models are used to generate facial gestures. Methods for generating facial gestures both for a single agent and for interlocutor-aware agents are presented, together with experiments validating their functionality. Furthermore, an experiment is presented that investigates an agent's social influence on its interlocutor when it mimics the interlocutor's facial gestures during conversation. 3) Evaluation methods are explored, and a method for more efficient perceptual experiments is presented. The method is validated by replicating a previously conducted experiment with virtual agents, and shows that the results obtained with this new method provide similar insights (in fact, more insights) while being more efficient in terms of the time evaluators needed to spend. A second study examines the difference between performing subjective evaluations of generated gestures in a laboratory environment and using crowdworkers, and showed that no difference could be measured. A special focus is placed on using scalable methods, since this enables efficient and rapid collection of multifaceted interaction data from many different people, as well as evaluation of the behaviors generated by the machine learning models, which in turn enables fast iteration during development.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2022. p. 77
Series
TRITA-EECS-AVL ; 2022:15
Keywords
non-verbal behavior generation, interlocutor-aware, data collection, behavior modeling, evaluation methods
National Category
Computer Systems
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-309467 (URN); 978-91-8040-151-7 (ISBN)
Public defence
2022-03-25, U1, https://kth-se.zoom.us/j/62813774919, Brinellvägen 26, Stockholm, 14:00 (English)
Opponent
Supervisors
Note

QC 20220307

Available from: 2022-03-07 Created: 2022-03-03 Last updated: 2022-06-25. Bibliographically approved
Kucherenko, T., Jonell, P., Yoon, Y., Wolfert, P. & Henter, G. E. (2021). A large, crowdsourced evaluation of gesture generation systems on common data: The GENEA Challenge 2020. In: Proceedings IUI '21: 26th International Conference on Intelligent User Interfaces. Paper presented at IUI '21: 26th International Conference on Intelligent User Interfaces, College Station, TX, USA, April 13-17, 2021 (pp. 11-21). Association for Computing Machinery (ACM)
2021 (English). In: Proceedings IUI '21: 26th International Conference on Intelligent User Interfaces, Association for Computing Machinery (ACM), 2021, p. 11-21. Conference paper, Published paper (Refereed)
Abstract [en]

Co-speech gestures, gestures that accompany speech, play an important role in human communication. Automatic co-speech gesture generation is thus a key enabling technology for embodied conversational agents (ECAs), since humans expect ECAs to be capable of multi-modal communication. Research into gesture generation is rapidly gravitating towards data-driven methods. Unfortunately, individual research efforts in the field are difficult to compare: there are no established benchmarks, and each study tends to use its own dataset, motion visualisation, and evaluation methodology. To address this situation, we launched the GENEA Challenge, a gesture-generation challenge wherein participating teams built automatic gesture-generation systems on a common dataset, and the resulting systems were evaluated in parallel in a large, crowdsourced user study using the same motion-rendering pipeline. Since differences in evaluation outcomes between systems are now solely attributable to differences between the motion-generation methods, this enables benchmarking recent approaches against one another in order to get a better impression of the state of the art in the field. This paper reports on the purpose, design, results, and implications of our challenge.
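Because all submitted systems were rated in one joint crowdsourced study with a shared motion-rendering pipeline, comparing them reduces to summarising the pooled ratings per system. Below is a minimal sketch of one such summary, a median rating with a percentile-bootstrap confidence interval; the system names and scores are hypothetical illustrations, not challenge results.

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_median_ci(ratings, n_boot=2000, alpha=0.05):
    """Median rating with a percentile-bootstrap confidence interval."""
    ratings = np.asarray(ratings)
    meds = [np.median(rng.choice(ratings, size=len(ratings), replace=True))
            for _ in range(n_boot)]
    lo, hi = np.percentile(meds, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(np.median(ratings)), float(lo), float(hi)

# Hypothetical 1-100 human-likeness scores for two submitted systems,
# all collected with the same visualisation pipeline.
scores = {
    "system_A": rng.integers(30, 91, size=200),
    "system_B": rng.integers(20, 81, size=200),
}
for name, s in scores.items():
    med, lo, hi = bootstrap_median_ci(s)
    print(f"{name}: median {med:.0f} (95% CI {lo:.0f}-{hi:.0f})")
```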

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2021
Keywords
gesture generation, conversational agents, evaluation paradigms
National Category
Human Computer Interaction
Research subject
Human-computer Interaction
Identifiers
urn:nbn:se:kth:diva-296490 (URN); 10.1145/3397481.3450692 (DOI); 000747690200006 (); 2-s2.0-85102546745 (Scopus ID)
Conference
IUI '21: 26th International Conference on Intelligent User Interfaces, College Station, TX, USA, April 13-17, 2021
Funder
Swedish Foundation for Strategic Research, RIT15-0107; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Part of Proceedings: ISBN 978-145038017-1

QC 20220303

Available from: 2021-06-05 Created: 2021-06-05 Last updated: 2022-06-25. Bibliographically approved
Kucherenko, T., Jonell, P., Yoon, Y., Wolfert, P., Yumak, Z. & Henter, G. E. (2021). GENEA Workshop 2021: The 2nd Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents. In: Proceedings of ICMI '21: International Conference on Multimodal Interaction. Paper presented at ICMI '21: International Conference on Multimodal Interaction, Montréal, QC, Canada, October 18-22, 2021 (pp. 872-873). Association for Computing Machinery (ACM)
2021 (English). In: Proceedings of ICMI '21: International Conference on Multimodal Interaction, Association for Computing Machinery (ACM), 2021, p. 872-873. Conference paper, Published paper (Refereed)
Abstract [en]

Embodied agents benefit from using non-verbal behavior when communicating with humans. Despite several decades of non-verbal behavior-generation research, there is currently no well-developed benchmarking culture in the field. For example, most researchers do not compare their outcomes with previous work, and if they do, they often do so in their own way, which is frequently incompatible with the comparisons made by others. With the GENEA Workshop 2021, we aim to bring the community together to discuss key challenges and solutions, and to find the most appropriate ways to move the field forward.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2021
Keywords
behavior synthesis, datasets, evaluation, gesture generation, Behavior generation, Dataset, Embodied agent, Non-verbal behaviours, Behavioral research
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-313185 (URN); 10.1145/3462244.3480983 (DOI); 2-s2.0-85118969127 (Scopus ID)
Conference
ICMI '21: International Conference on Multimodal Interaction, Montréal, QC, Canada, October 18-22, 2021
Note

Part of proceedings ISBN 9781450384810

QC 20220602

Available from: 2022-06-02 Created: 2022-06-02 Last updated: 2022-06-25. Bibliographically approved
Jonell, P., Yoon, Y., Wolfert, P., Kucherenko, T. & Henter, G. E. (2021). HEMVIP: Human Evaluation of Multiple Videos in Parallel. In: ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction. Paper presented at the International Conference on Multimodal Interaction, Montreal, Canada, October 18-22, 2021 (pp. 707-711). New York, NY, United States: Association for Computing Machinery (ACM)
2021 (English). In: ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction, New York, NY, United States: Association for Computing Machinery (ACM), 2021, p. 707-711. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

In many research areas, for example motion and gesture generation, objective measures alone do not provide an accurate impression of key stimulus traits such as perceived quality or appropriateness. The gold standard is instead to evaluate these aspects through user studies, especially subjective evaluations of video stimuli. Common evaluation paradigms either present individual stimuli to be scored on Likert-type scales, or ask users to compare and rate videos in a pairwise fashion. However, the time and resources required for such evaluations scale poorly as the number of conditions to be compared increases. Building on standards used for evaluating the quality of multimedia codecs, this paper instead introduces a framework for granular rating of multiple comparable videos in parallel. This methodology essentially analyses all condition pairs at once. Our contributions are 1) a proposed framework, called HEMVIP, for parallel and granular evaluation of multiple video stimuli and 2) a validation study confirming that results obtained using the tool are in close agreement with results of prior studies using conventional multiple pairwise comparisons.
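The efficiency gain can be made concrete: when C comparable conditions are rated side by side on parallel sliders, a single screen implicitly yields all C(C-1)/2 pairwise orderings, which a conventional pairwise paradigm would need separate trials to collect. A minimal sketch of that reduction follows; the condition names and slider values are hypothetical and this is not the HEMVIP implementation itself.

```python
from itertools import combinations

def pairwise_preferences(ratings: dict) -> list:
    """Derive implicit pairwise preferences from one parallel rating screen.

    `ratings` maps condition name -> slider value (e.g. 0-100) given by one
    evaluator to all conditions rated side by side. Returns (winner, loser)
    tuples; ties are skipped.
    """
    prefs = []
    for a, b in combinations(ratings, 2):
        if ratings[a] > ratings[b]:
            prefs.append((a, b))
        elif ratings[b] > ratings[a]:
            prefs.append((b, a))
    return prefs

# One screen with four conditions yields up to 6 implicit comparisons at once,
# instead of 6 separate pairwise trials.
screen = {"cond_A": 72, "cond_B": 55, "cond_C": 55, "cond_D": 18}
print(pairwise_preferences(screen))
```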

Place, publisher, year, edition, pages
New York, NY, United States: Association for Computing Machinery (ACM), 2021
Keywords
evaluation paradigms, video evaluation, conversational agents, gesture generation
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-309462 (URN); 10.1145/3462244.3479957 (DOI); 2-s2.0-85113672097 (Scopus ID)
Conference
International Conference on Multimodal Interaction, Montreal, Canada, October 18-22, 2021
Funder
Swedish Foundation for Strategic Research, RIT15-0107; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Part of proceedings: ISBN 978-1-4503-8481-0

QC 20220309

Available from: 2022-03-03 Created: 2022-03-03 Last updated: 2023-01-18. Bibliographically approved
Jonell, P., Deichler, A., Torre, I., Leite, I. & Beskow, J. (2021). Mechanical Chameleons: Evaluating the effects of a social robot's non-verbal behavior on social influence. In: Proceedings of SCRITA 2021, a workshop at IEEE RO-MAN 2021. Paper presented at Trust, Acceptance and Social Cues in Human-Robot Interaction - SCRITA, 12 August, 2021.
2021 (English). In: Proceedings of SCRITA 2021, a workshop at IEEE RO-MAN 2021, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present a pilot study which investigates how non-verbal behavior affects social influence in social robots. We also present a modular system which is capable of controlling the non-verbal behavior based on the interlocutor's facial gestures (head movements and facial expressions) in real time, and a study investigating whether three different strategies for facial gestures ("still", "natural movement", i.e. movements recorded from another conversation, and "copy", i.e. mimicking the user with a four-second delay) have any effect on social influence and decision making in a "survival task". Our preliminary results show there was no significant difference between the three conditions, but this might be due to, among other things, the low number of study participants (12).
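The "copy" strategy, mimicking the interlocutor's head movements and facial expressions with a four-second delay, can be illustrated with a fixed-length buffer that replays observed facial parameters four seconds later. This is a sketch under an assumed frame rate and assumed parameter names; it is not the modular system described in the paper.

```python
from collections import deque

FPS = 30                       # assumed capture/playback frame rate
DELAY_S = 4.0                  # mimicry delay used in the "copy" condition
DELAY_FRAMES = int(FPS * DELAY_S)

class DelayedMimic:
    """Replays the interlocutor's facial parameters after a fixed delay."""

    def __init__(self, neutral_pose):
        self.buffer = deque(maxlen=DELAY_FRAMES)
        self.neutral_pose = neutral_pose

    def step(self, observed_pose):
        """Push the latest observation; return what the robot should do now."""
        self.buffer.append(observed_pose)
        if len(self.buffer) < DELAY_FRAMES:
            return self.neutral_pose   # not enough history yet: stay still
        return self.buffer[0]          # pose observed roughly 4 s ago

mimic = DelayedMimic(neutral_pose={"head_pitch": 0.0, "head_yaw": 0.0, "smile": 0.0})
for t in range(FPS * 10):              # 10 s of a hypothetical interaction
    robot_pose = mimic.step({"head_pitch": 0.1, "head_yaw": 0.0, "smile": 0.5})
```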

National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-309464 (URN)
Conference
Trust, Acceptance and Social Cues in Human-Robot Interaction - SCRITA, 12 August, 2021
Funder
Swedish Foundation for Strategic Research, RIT15-0107; Swedish Research Council, 2018-05409
Note

QC 20220308

Available from: 2022-03-03 Created: 2022-03-03 Last updated: 2022-06-25. Bibliographically approved
Jonell, P., Moell, B., Håkansson, K., Henter, G. E., Kucherenko, T., Mikheeva, O., . . . Beskow, J. (2021). Multimodal Capture of Patient Behaviour for Improved Detection of Early Dementia: Clinical Feasibility and Preliminary Results. Frontiers in Computer Science, 3, Article ID 642633.
2021 (English). In: Frontiers in Computer Science, E-ISSN 2624-9898, Vol. 3, article id 642633. Article in journal (Refereed), Published
Abstract [en]

Non-invasive automatic screening for Alzheimer's disease has the potential to improve diagnostic accuracy while lowering healthcare costs. Previous research has shown that patterns in speech, language, gaze, and drawing can help detect early signs of cognitive decline. In this paper, we describe a highly multimodal system for unobtrusively capturing data during real clinical interviews conducted as part of cognitive assessments for Alzheimer's disease. The system uses nine different sensor devices (smartphones, a tablet, an eye tracker, a microphone array, and a wristband) to record interaction data during a specialist's first clinical interview with a patient, and is currently in use at Karolinska University Hospital in Stockholm, Sweden. Furthermore, complementary information in the form of brain imaging, psychological tests, speech therapist assessment, and clinical meta-data is also available for each patient. We detail our data-collection and analysis procedure and present preliminary findings that relate measures extracted from the multimodal recordings to clinical assessments and established biomarkers, based on data from 25 patients gathered thus far. Our findings demonstrate feasibility for our proposed methodology and indicate that the collected data can be used to improve clinical assessments of early dementia.
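Analysing recordings from nine heterogeneous devices requires putting all streams on a common clock before measures can be related to each other. The sketch below shows one simple way to do that, resampling an irregular low-rate stream onto a shared grid by linear interpolation; the device names, rates, and signal are illustrative assumptions, not the system's actual specification.

```python
import numpy as np

# Hypothetical per-device sample rates (Hz); the actual setup records nine
# devices, including smartphones, a tablet, an eye tracker, a microphone
# array, and a wristband.
SAMPLE_RATES = {"eye_tracker": 60, "microphone_array": 16000, "wristband": 4}

def resample_to_common_clock(timestamps, values, target_hz=30, duration_s=10):
    """Linearly interpolate an irregular stream onto a shared clock."""
    grid = np.arange(0, duration_s, 1.0 / target_hz)
    return grid, np.interp(grid, timestamps, values)

# Example: align a 4 Hz wristband signal to a shared 30 Hz clock.
t = np.arange(0, 10, 1 / SAMPLE_RATES["wristband"])
heart_rate = 70 + np.sin(t / 3)                     # made-up signal
grid, aligned = resample_to_common_clock(t, heart_rate, target_hz=30, duration_s=10)
print(grid.shape, aligned.shape)
```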

Place, publisher, year, edition, pages
Frontiers Media SA, 2021
Keywords
Alzheimer, mild cognitive impairment, multimodal prediction, speech, gaze, pupil dilation, thermal camera, pen motion
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-303883 (URN); 10.3389/fcomp.2021.642633 (DOI); 000705498300001 (); 2-s2.0-85115692731 (Scopus ID)
Note

QC 20211022

Available from: 2021-10-22 Created: 2021-10-22 Last updated: 2025-02-07. Bibliographically approved
Kucherenko, T., Nagy, R., Jonell, P., Neff, M., Kjellström, H. & Henter, G. E. (2021). Speech2Properties2Gestures: Gesture-Property Prediction as a Tool for Generating Representational Gestures from Speech. In: IVA '21: Proceedings of the 21st ACM International Conference on Intelligent Virtual Agents. Paper presented at the 21st ACM International Conference on Intelligent Virtual Agents, IVA 2021, Virtual, Online, 14 September 2021 through 17 September 2021, University of Fukuchiyama, Fukuchiyama City, Kyoto, Japan (pp. 145-147). Association for Computing Machinery (ACM)
2021 (English). In: IVA '21: Proceedings of the 21st ACM International Conference on Intelligent Virtual Agents, Association for Computing Machinery (ACM), 2021, p. 145-147. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a new framework for gesture generation, aiming to allow data-driven approaches to produce more semantically rich gestures. Our approach first predicts whether to gesture, followed by a prediction of the gesture properties. Those properties are then used as conditioning for a modern probabilistic gesture-generation model capable of high-quality output. This empowers the approach to generate gestures that are both diverse and representational. Follow-ups and more information can be found on the project page: https://svito-zar.github.io/speech2properties2gestures
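The proposed framework is a staged pipeline: first decide whether to gesture at all, then predict gesture properties, and finally use those properties as conditioning for a probabilistic motion generator. The sketch below shows that control flow only, with placeholder functions standing in for the trained components; all names, shapes, and feature dimensions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_gesture_presence(speech_features):
    """Stage 1 (placeholder): binary decision - should a gesture occur here?"""
    return float(speech_features.mean()) > 0.0   # stands in for a trained classifier

def predict_gesture_properties(speech_features):
    """Stage 2 (placeholder): predict properties of the gesture to be made."""
    return {"category": "iconic", "handedness": "right"}

def generate_motion(speech_features, properties):
    """Stage 3 (placeholder): probabilistic generator conditioned on properties."""
    n_frames, n_joints = speech_features.shape[0], 15
    motion = rng.normal(scale=0.01, size=(n_frames, n_joints))
    if properties is not None:
        motion += 0.1                             # conditioning changes the output
    return motion

speech = rng.normal(size=(60, 26))   # e.g. 60 frames of assumed audio features
props = predict_gesture_properties(speech) if predict_gesture_presence(speech) else None
motion = generate_motion(speech, props)
print(motion.shape)
```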

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2021
Keywords
gesture generation, virtual agents, representational gestures
National Category
Human Computer Interaction
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-302667 (URN); 10.1145/3472306.3478333 (DOI); 000728149900023 (); 2-s2.0-85113524837 (Scopus ID)
Conference
21st ACM International Conference on Intelligent Virtual Agents, IVA 2021, Virtual, Online, 14 September 2021 through 17 September 2021, University of Fukuchiyama, Fukuchiyama City, Kyoto, Japan
Funder
Swedish Foundation for Strategic Research, RIT15-0107; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20211102

Part of Proceedings: ISBN 9781450386197

Available from: 2021-09-28 Created: 2021-09-28 Last updated: 2022-06-25. Bibliographically approved
Oertel, C., Jonell, P., Kontogiorgos, D., Mora, K. F., Odobez, J.-M. & Gustafsson, J. (2021). Towards an Engagement-Aware Attentive Artificial Listener for Multi-Party Interactions. Frontiers in Robotics and AI, 8, Article ID 555913.
2021 (English). In: Frontiers in Robotics and AI, E-ISSN 2296-9144, Vol. 8, article id 555913. Article in journal (Refereed), Published
Abstract [en]

Listening to one another is essential to human-human interaction. In fact, we humans spend a substantial part of our day listening to other people, in private as well as in work settings. Attentive listening serves the function of gathering information for oneself, but at the same time it also signals to the speaker that he/she is being heard. To deduce whether our interlocutor is listening to us, we rely on reading his/her nonverbal cues, very much like how we also use non-verbal cues to signal our attention. Such signaling becomes more complex when we move from dyadic to multi-party interactions. Understanding how humans use nonverbal cues in a multi-party listening context not only increases our understanding of human-human communication but also aids the development of successful human-robot interactions. This paper aims to bring together previous analyses of listener behavior in human-human multi-party interaction and provide novel insights into gaze patterns between the listeners in particular. We investigate whether the gaze patterns and feedback behavior, as observed in human-human dialogue, are also beneficial for the perception of a robot in multi-party human-robot interaction. To answer this question, we implement an attentive listening system that generates multi-modal listening behavior based on our human-human analysis. We compare our system to a baseline system that does not differentiate between different listener types in its behavior generation, and evaluate it in terms of the participants' perception of the robot, their behavior, as well as the perception of third-party observers.

Place, publisher, year, edition, pages
Frontiers Media SA, 2021
Keywords
multi-party interactions, non-verbal behaviors, eye-gaze patterns, head gestures, human-robot interaction, artificial listener, social signal processing
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-299298 (URN); 10.3389/frobt.2021.555913 (DOI); 000673604300001 (); 34277714 (PubMedID); 2-s2.0-85110106028 (Scopus ID)
Note

QC 20220301

Available from: 2021-08-18 Created: 2021-08-18 Last updated: 2022-06-25. Bibliographically approved
Jonell, P., Kucherenko, T., Torre, I. & Beskow, J. (2020). Can we trust online crowdworkers?: Comparing online and offline participants in a preference test of virtual agents. In: IVA '20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents. Paper presented at IVA '20: ACM International Conference on Intelligent Virtual Agents, Virtual Event, Scotland, UK, October 20-22, 2020. Association for Computing Machinery (ACM)
2020 (English). In: IVA '20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, Association for Computing Machinery (ACM), 2020. Conference paper, Published paper (Refereed)
Abstract [en]

Conducting user studies is a crucial component in many scientific fields. While some studies require participants to be physically present, other studies can be conducted both physically (e.g. in-lab) and online (e.g. via crowdsourcing). Inviting participants to the lab can be a time-consuming and logistically difficult endeavor, not to mention that sometimes research groups might not be able to run in-lab experiments because of, for example, a pandemic. Crowdsourcing platforms such as Amazon Mechanical Turk (AMT) or Prolific can therefore be a suitable alternative for running certain experiments, such as evaluating virtual agents. Although previous studies have investigated the use of crowdsourcing platforms for running experiments, there is still uncertainty as to whether the results are reliable for perceptual studies. Here we replicate a previous experiment where participants evaluated a gesture generation model for virtual agents. The experiment is conducted across three participant pools: in-lab, Prolific, and AMT, with similar demographics across the in-lab participants and the Prolific platform. Our results show no difference between the three participant pools with regard to their evaluations of the gesture generation models and their reliability scores. The results indicate that online platforms can successfully be used for perceptual evaluations of this kind.
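Comparing the three participant pools comes down to testing whether their rating distributions differ. The sketch below shows one plausible way to run such a check, a Kruskal-Wallis test over hypothetical ordinal ratings; it is an illustration of the idea, not necessarily the analysis used in the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical preference scores from the three pools (the real study used
# a preference test of gesture-generation conditions).
pools = {
    "in-lab":   rng.integers(1, 8, size=30),
    "Prolific": rng.integers(1, 8, size=30),
    "AMT":      rng.integers(1, 8, size=30),
}

# Kruskal-Wallis: non-parametric test of whether the pools' ratings differ.
h, p = stats.kruskal(*pools.values())
print(f"H = {h:.2f}, p = {p:.3f}")   # a large p would mean no detected difference
```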

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2020
Keywords
user studies, online participants, attentiveness
National Category
Human Computer Interaction
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-290562 (URN); 10.1145/3383652.3423860 (DOI); 000728153600002 (); 2-s2.0-85096979963 (Scopus ID)
Conference
IVA '20: ACM International Conference on Intelligent Virtual Agents, Virtual Event, Scotland, UK, October 20-22, 2020
Funder
Swedish Foundation for Strategic Research, RIT15-0107; Wallenberg AI, Autonomous Systems and Software Program (WASP), CorSA
Note

QC 20211109

Part of Proceedings: ISBN 978-145037586-3

Taras Kucherenko and Patrik Jonell contributed equally to this research.

Available from: 2021-02-18 Created: 2021-02-18 Last updated: 2022-06-25. Bibliographically approved
Cohn, M., Jonell, P., Kim, T., Beskow, J. & Zellou, G. (2020). Embodiment and gender interact in alignment to TTS voices. In: Proceedings for the 42nd Annual Meeting of the Cognitive Science Society: Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020. Paper presented at 42nd Annual Meeting of the Cognitive Science Society: Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020, 29 July 2020 through 1 August 2020 (pp. 220-226). The Cognitive Science Society
2020 (English). In: Proceedings for the 42nd Annual Meeting of the Cognitive Science Society: Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020, The Cognitive Science Society, 2020, p. 220-226. Conference paper, Published paper (Refereed)
Abstract [en]

The current study tests subjects' vocal alignment toward female and male text-to-speech (TTS) voices presented via three systems: Amazon Echo, Nao, and Furhat. These systems vary in their physical form, ranging from a cylindrical speaker (Echo), to a small robot (Nao), to a human-like robot bust (Furhat). We test whether this cline of personification (cylinder < mini robot < human-like robot bust) predicts patterns of gender-mediated vocal alignment. In addition to comparing multiple systems, this study addresses a confound in many prior vocal alignment studies by using identical voices across the systems. Results show evidence for a cline of personification toward female TTS voices by female shadowers (Echo < Nao < Furhat) and a more categorical effect of device personification for male TTS voices by male shadowers (Echo < Nao, Furhat). These findings are discussed in terms of their implications for models of device-human interaction and theories of computer personification. 
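Vocal alignment in shadowing studies of this kind is often quantified with a difference-in-distance measure: how much closer the shadower's production is to the model voice after exposure than before. Below is a minimal sketch using mean fundamental frequency as an assumed feature; it illustrates the general measure, not necessarily the analysis used in this paper, and the values are made up.

```python
def alignment_did(pre: float, post: float, model: float) -> float:
    """Difference-in-distance (DID): positive values mean the shadower moved
    toward the model talker on this feature after exposure."""
    return abs(pre - model) - abs(post - model)

# Hypothetical mean f0 values (Hz) for one shadower and one TTS model voice.
print(alignment_did(pre=210.0, post=200.0, model=190.0))  # 10.0 -> aligned by 10 Hz
```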

Place, publisher, year, edition, pages
The Cognitive Science Society, 2020
Keywords
embodiment, gender, human-device interaction, text-to-speech, vocal alignment, Human computer interaction, Human robot interaction, Human like robots, Minirobots, Robot Nao, Small robots, Text to speech, Alignment
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-323807 (URN); 2-s2.0-85129583974 (Scopus ID)
Conference
42nd Annual Meeting of the Cognitive Science Society: Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020, 29 July 2020 through 1 August 2020
Note

QC 20230213

Available from: 2023-02-13 Created: 2023-02-13 Last updated: 2025-02-07. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-3687-6189
