kth.se Publications
1 - 18 of 18
  • 1.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Adaptive Robot Discourse for Language Acquisition in Adulthood (2022). In: Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction, Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 1158-1160. Conference paper (Refereed).
    Abstract [en]

    Acquiring a second language in adulthood differs considerably from the approach taken at younger ages. Learning rates tend to decrease during adolescence, and socio-emotional characteristics, like motivation and expectations, play a different role for adults. In particular, acquiring communicative competence is a stronger objective for older learners, as an appropriate use of language in social contexts supports better community immersion and well-being. This skill is best attained through interactions with proficient speakers, but if this option is not available, social robots present a good alternative for this purpose. However, to obtain optimal results, a robot companion should continuously adapt to the learner's proficiency level and motivation to encourage speech production and increase fluency. Our work attempts to achieve this goal by developing an adaptive robot that modifies its spoken dialogue strategy and visual feedback to reflect a student's knowledge, proficiency and engagement levels in situated interactions for long-term learning.

  • 2.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Robots Beyond Borders: The Role of Social Robots in Spoken Second Language Practice (2024). Doctoral thesis, comprehensive summary (Other academic).
    Abstract [en]

    This thesis investigates how social robots can support adult second language (L2) learners in improving conversational skills. It recognizes the challenges inherent in adult L2 learning, including increased cognitive demands and the unique motivations driving adult education. While social robots hold potential for natural interactions and language education, research into conversational skill practice with adult learners remains underexplored. Thus, the thesis contributes to understanding these conversational dynamics, enhancing speaking practice, and examining cultural perspectives in this context.

    To begin, this thesis investigates robot-led conversations with L2 learners, examining how learners respond to moments of uncertainty. The research reveals that when faced with uncertainty, learners frequently seek clarification, yet many remain unresponsive. As a result, effective strategies are required from robot conversational partners to address this challenge. These interactions are then used to evaluate the performance of off-the-shelf Automatic Speech Recognition (ASR) systems. The assessment highlights that speech recognition for L2 speakers is not as effective as for L1 speakers, with performance deteriorating for both groups during social conversations. Addressing these challenges is imperative for the successful integration of robots in conversational practice with L2 learners.

    The thesis then explores the potential advantages of employing social robots in collaborative learning environments with multi-party interactions. It delves into strategies for improving speaking practice, including the use of non-verbal behaviors to encourage learners to speak. For instance, a robot's adaptive gazing behavior is used to effectively balance speaking contributions between L1 and L2 pairs of participants. Moreover, an adaptive use of encouraging backchannels significantly increases the speaking time of L2 learners.

    Finally, the thesis highlights the importance of further research on cultural aspects in human-robot interactions. One study reveals distinct responses among various socio-cultural groups in interaction between L1 and L2 participants. For example, factors such as gender, age, extroversion, and familiarity with robots influence conversational engagement of L2 speakers. Additionally, another study investigates preconceptions related to the appearance and accents of nationality-encoded (virtual and physical) social robots. The results indicate that initial perceptions may lead to negative preconceptions, but that these perceptions diminish after actual interactions.

    Despite technical limitations, social robots provide distinct benefits in supporting educational endeavors. This thesis emphasizes the potential of social robots as effective facilitators of spoken language practice for adult learners, advocating for continued exploration at the intersection of language education, human-robot interaction, and technology.

  • 3.
    Cumbal, Ronald
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Axelsson, Agnes
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Mehta, Shivam
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Engwall, Olov
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Stereotypical nationality representations in HRI: perspectives from international young adults (2023). In: Frontiers in Robotics and AI, E-ISSN 2296-9144, Vol. 10, article id 1264614. Article in journal (Refereed).
    Abstract [en]

    People often form immediate expectations about other people, or groups of people, based on visual appearance and characteristics of their voice and speech. These stereotypes, often inaccurate or overgeneralized, may translate to robots that carry human-like qualities. This study aims to explore if nationality-based preconceptions regarding appearance and accents can be found in people's perception of a virtual and a physical social robot. In an online survey with 80 subjects evaluating different first-language-influenced accents of English and nationality-influenced human-like faces for a virtual robot, we find that accents, in particular, lead to preconceptions on perceived competence and likeability that correspond to previous findings in social science research. In a physical interaction study with 74 participants, we then studied if the perception of competence and likeability is similar after interacting with a robot portraying one of four different nationality representations from the online survey. We find that preconceptions on national stereotypes that appeared in the online survey vanish or are overshadowed by factors related to general interaction quality. We do, however, find some effects of the robot's stereotypical alignment with the subject group, with Swedish subjects (the majority group in this study) rating the Swedish-accented robot as less competent than the international group, but, on the other hand, recalling more facts from the Swedish robot's presentation than the international group does. In an extension in which the physical robot was replaced by a virtual robot interacting in the same scenario online, we further found the same result, that preconceptions are of less importance after actual interactions, hence demonstrating that the differences in the ratings of the robot between the online survey and the interaction are not due to the interaction medium. We hence conclude that attitudes towards stereotypical national representations in HRI have a weak effect, at least for the user group included in this study (primarily educated young students in an international setting).

  • 4.
    Cumbal, Ronald
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    David Lopes, José
    Heriot-Watt University.
    Engwall, Olov
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Detection of Listener Uncertainty in Robot-Led Second Language Conversation Practice (2020). In: Proceedings ICMI '20: International Conference on Multimodal Interaction, Association for Computing Machinery (ACM), 2020. Conference paper (Refereed).
    Abstract [en]

    Uncertainty is a frequently occurring affective state that learners experience during the acquisition of a second language. This state can constitute both a learning opportunity and a source of learner frustration. An appropriate detection could therefore benefit the learning process by reducing cognitive instability. In this study, we use a dyadic practice conversation between an adult second-language learner and a social robot to elicit events of uncertainty through the manipulation of the robot's spoken utterances (increased lexical complexity or prosody modifications). The characteristics of these events are then used to analyze multi-party practice conversations between a robot and two learners. Classification models are trained with multimodal features from annotated events of listener (un)certainty. We report the performance of our models on different settings, (sub)turn segments and multimodal inputs.

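    The classification approach this abstract describes can be illustrated with a minimal sketch: a supervised classifier trained on per-segment multimodal feature vectors with binary (un)certainty labels. The feature layout, the placeholder random data and the random-forest model below are assumptions for illustration, not the paper's actual pipeline.

```python
# Illustrative sketch (not the paper's implementation): train a binary
# listener (un)certainty classifier from multimodal features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# One row per annotated (sub)turn segment, e.g. facial Action Units,
# gaze proportions and prosodic statistics (placeholder random data here).
n_segments, n_features = 200, 12
X = rng.normal(size=(n_segments, n_features))
y = rng.integers(0, 2, size=n_segments)  # 1 = listener uncertainty

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test),
                            target_names=["certain", "uncertain"]))
```
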
  • 5.
    Cumbal, Ronald
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Engwall, Olov
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Speech Communication and Technology.
    Speaking Transparently: Social Robots in Educational Settings (2024). In: Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (HRI '24 Companion), March 11-14, 2024, Boulder, CO, USA, 2024. Conference paper (Refereed).
    Abstract [en]

    The recent surge in popularity of Large Language Models, known for their inherent opacity, has increased the interest in fostering transparency in technology designed for human interaction. This concern is equally prevalent in the development of Social Robots, particularly when these are designed to engage in critical areas of our society, such as education or healthcare. In this paper we propose an experiment to investigate how users can be made aware of the automated decision processes when interacting in a discussion with a social robot. Our main objective is to assess the effectiveness of verbal expressions in fostering transparency within groups of individuals as they engage with a robot. We describe the proposed interactive settings, system design, and our approach to enhance the transparency in a robot's decision-making process for multi-party interactions.

  • 6.
    Cumbal, Ronald
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Kazzi, Daniel Alexander
    KTH.
    Winberg, Vincent
    KTH.
    Engwall, Olov
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Shaping unbalanced multi-party interactions through adaptive robot backchannels (2022). In: IVA 2022 - Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents, Association for Computing Machinery, Inc, 2022. Conference paper (Refereed).
    Abstract [en]

    Non-verbal cues used in human communication have been shown to be effective in shaping spoken interactions. When applied to virtual agents or social robots, results imply that a similar effect can be expected in dyadic settings. In this study, we explore how encouraging backchannels, vocal and non-vocal, can stimulate speaking participation in a game-based multi-party interaction, where unbalanced contribution is expected. We design the study using a social robot, taking part in a language game with native speakers and language learners, to evaluate how an adaptive generation of backchannels, targeting the least speaking participant to encourage more speaking contribution, affects the group's and individual participants' behavior. We report results from experiments with 30 subjects divided into pairs assigned to the adaptive (encouraging) and neutral (control) conditions. Our results show that the speaking participation of the least active speaker increases significantly when the robot uses an adaptive backchanneling strategy. At the same time, the participation of the more active speaker slightly decreases, which causes the combined speaking time of both participants to be similar between the control and experimental conditions. The adaptive strategy further leads to a 50% decrease in the difference in speaker shares between the two participants (indicating a more balanced participation) compared to the control condition. However, this distribution between speaker ratios is not significantly different from the control.
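
    As a rough illustration of the adaptive strategy described in this abstract (directing encouraging backchannels at the least active speaker), the sketch below selects a backchannel target from accumulated speaking times. The threshold value, data structure and names are invented for the example.

```python
# Illustrative sketch of an adaptive backchannel policy: direct an
# encouraging backchannel at the participant with the smaller speaking
# share. Threshold and timing are assumptions, not the study's values.
from dataclasses import dataclass

@dataclass
class Participant:
    name: str
    speaking_time: float = 0.0  # accumulated seconds of speech

def backchannel_target(participants, min_imbalance=0.1):
    """Return the least active participant if the speaking-share
    imbalance exceeds `min_imbalance`, else None (stay neutral)."""
    total = sum(p.speaking_time for p in participants)
    if total == 0:
        return None
    shares = [(p.speaking_time / total, p) for p in participants]
    shares.sort(key=lambda t: t[0])
    if shares[-1][0] - shares[0][0] > min_imbalance:
        return shares[0][1]  # encourage the least active speaker
    return None

pair = [Participant("L1 speaker", 95.0), Participant("L2 learner", 41.0)]
target = backchannel_target(pair)
if target:
    print(f"Robot nods / says 'mm-hm' towards {target.name}")
```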

  • 7.
    Cumbal, Ronald
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Lopes, J.
    Heriot-Watt University, Edinburgh, United Kingdom.
    Engwall, Olov
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Uncertainty in robot assisted second language conversation practice (2020). In: HRI '20: Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, Association for Computing Machinery (ACM), 2020, p. 171-173. Conference paper (Refereed).
    Abstract [en]

    Moments of uncertainty are common for learners when practicing a second language. The appropriate management of these events could help avoid the development of frustration and benefit the learner's experience. Therefore, their detection is crucial in language practice conversations. In this study, an experimental conversation between an adult second language learner and a social robot is employed to visually characterize the learners' uncertainty. The robot's output is manipulated at the prosodic and lexical levels to provoke uncertainty during the conversation. These reactions are then processed to obtain Facial Action Units (AUs) and gaze features. Preliminary results show distinctive behavioral patterns of uncertainty among the participants. Based on these results, a new annotation scheme is proposed, which will expand the data used to train sequential models to detect uncertainty. As future steps, the robotic conversational partner will use this information to adapt its behavior in dialogue generation and language complexity.

  • 8.
    Cumbal, Ronald
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Moell, Birger
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Águas Lopes, José David
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Engwall, Olov
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Speech Communication and Technology.
    “You don’t understand me!”: Comparing ASR Results for L1 and L2 Speakers of Swedish (2021). In: Proceedings Interspeech 2021, International Speech Communication Association, 2021, p. 96-100. Conference paper (Refereed).
    Abstract [en]

    The performance of Automatic Speech Recognition (ASR) systems has constantly increased in state-of-the-art development. However, performance tends to decrease considerably in more challenging conditions (e.g., background noise, multiple-speaker social conversations) and with more atypical speakers (e.g., children, non-native speakers or people with speech disorders), which signifies that general improvements do not necessarily transfer to applications that rely on ASR, e.g., educational software for younger students or language learners. In this study, we focus on the gap in performance between recognition results for native and non-native, read and spontaneous, Swedish utterances transcribed by different ASR services. We compare the recognition results using Word Error Rate and analyze the linguistic factors that may generate the observed transcription errors.

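    Word Error Rate, the metric used in this comparison, is the word-level Levenshtein distance (substitutions, insertions, deletions) between a reference transcript and the ASR hypothesis, normalised by the reference length. A self-contained sketch, with invented Swedish example sentences:

```python
# Illustrative sketch: compute Word Error Rate (WER), the metric the
# abstract uses to compare ASR output for L1 and L2 speakers.
def wer(reference: str, hypothesis: str) -> float:
    r, h = reference.split(), hypothesis.split()
    # Standard Levenshtein distance over words, normalised by
    # the number of reference words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(r)][len(h)] / len(r)

ref = "jag bor i stockholm sedan två år"   # invented example
hyp = "ja bor i stockholm sen två år"      # invented ASR output
print(f"WER = {wer(ref, hyp):.2%}")        # 2 errors / 7 words ≈ 28.57%
```
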
  • 9.
    Engwall, Olov
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Lopes, Jose
    Heriot-Watt University, Edinburgh, Midlothian, Scotland.
    Ljung, Mikael
    KTH.
    Månsson, Linnea
    KTH.
    Identification of Low-engaged Learners in Robot-led Second Language Conversations with Adults (2022). In: ACM Transactions on Human-Robot Interaction, E-ISSN 2573-9522, Vol. 11, no 2, article id 18. Article in journal (Refereed).
    Abstract [en]

    The main aim of this study is to investigate if verbal, vocal, and facial information can be used to identify low-engaged second language learners in robot-led conversation practice. The experiments were performed on voice recordings and video data from 50 conversations, in which a robotic head talks with pairs of adult language learners using four different interaction strategies with varying robot-learner focus and initiative. It was found that these robot interaction strategies influenced learner activity and engagement. The verbal analysis indicated that learners with low activity rated the robot significantly lower on two out of four scales related to social competence. The acoustic vocal and video-based facial analysis, based on manual annotations or machine learning classification, both showed that learners with low engagement rated the robot's social competencies consistently, and in several cases significantly, lower, and in addition rated the learning effectiveness lower. The agreement between manual and automatic identification of low-engaged learners based on voice recordings or face videos was further found to be adequate for future use. These experiments constitute a first step towards enabling adaptation to learners' activity and engagement through within- and between-strategy changes of the robot's interaction with learners.
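
    The reported agreement between manual and automatic identification of low-engaged learners can be quantified with a chance-corrected agreement statistic. The abstract does not name the measure used, so Cohen's kappa and the labels below are purely illustrative assumptions:

```python
# Illustrative sketch (assumed metric, invented labels): quantify
# agreement between manual annotation and automatic classification
# of low-engaged learners with Cohen's kappa.
from sklearn.metrics import cohen_kappa_score

# 1 = low engagement, 0 = engaged; one label per learner/session.
manual    = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]
automatic = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

kappa = cohen_kappa_score(manual, automatic)
print(f"Cohen's kappa = {kappa:.2f}")  # ≈ 0.73 for these invented labels
```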

  • 10.
    Engwall, Olov
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Speech Communication and Technology.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Majlesi, Ali Reza
    Socio-cultural perception of robot backchannels (2023). In: Frontiers in Robotics and AI, E-ISSN 2296-9144, Vol. 10. Article in journal (Refereed).
    Abstract [en]

    Introduction: Backchannels, i.e., short interjections by an interlocutor to indicate attention, understanding or agreement regarding utterances by another conversation participant, are fundamental in human-human interaction. A lack of backchannels, or backchannels with unexpected timing or formulation, may influence the conversation negatively, as misinterpretations regarding attention, understanding or agreement may occur. However, several studies over the years have shown that there may be cultural differences in how backchannels are provided and perceived and that these differences may affect intercultural conversations. Culturally aware robots must hence be endowed with the capability to detect and adapt to the way these conversational markers are used across different cultures. Traditionally, culture has been defined in terms of nationality, but this is more and more considered to be a stereotypic simplification. We therefore investigate several socio-cultural factors, such as the participants’ gender, age, first language, extroversion and familiarity with robots, that may be relevant for the perception of backchannels.

    Methods: We first cover existing research on cultural influence on backchannel formulation and perception in human-human interaction and on backchannel implementation in Human-Robot Interaction. We then present an experiment on second language spoken practice, in which we investigate how backchannels from the social robot Furhat influence interaction (investigated through speaking time ratios, ethnomethodology and multimodal conversation analysis) and the impression of the robot (measured by post-session ratings). The experiment, set in a triad word game, focuses on whether activity-adaptive robot backchannels may redistribute the participants’ speaking time ratio, and/or whether the participants’ assessment of the robot is influenced by the backchannel strategy. The goal is to explore how robot backchannels should be adapted to different language learners to encourage their participation while being perceived as socio-culturally appropriate.

    Results: We find that a strategy that displays more backchannels towards a less active speaker may substantially decrease the difference in speaking time between the two speakers, that different socio-cultural groups respond differently to the robot’s backchannel strategy and that they also perceive the robot differently after the session.

    Discussion: We conclude that the robot may need different backchanneling strategies towards speakers from different socio-cultural groups in order to encourage them to speak and have a positive perception of the robot.

  • 11.
    Engwall, Olov
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Águas Lopes, José David
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Ljung, Mikael
    Månsson, Linnea
    Identification of low-engaged learners in robot-led second language conversations with adults. Manuscript (preprint) (Other academic).
    Abstract [en]

    The main aim of this study is to investigate if verbal, vocal and facial information can be used to identify low-engaged second language learners in robot-led conversation practice. The experiments were performed on voice recordings and video data from 50 conversations, in which a robotic head talks with pairs of adult language learners using four different interaction strategies with varying robot-learner focus and initiative. It was found that these robot interaction strategies influenced learner activity and engagement. The verbal analysis indicated that learners with low activity rated the robot significantly lower on two out of four scales related to social competence. The acoustic vocal and video-based facial analysis, based on manual annotations or machine learning classification, both showed that learners with low engagement rated the robot’s social competencies consistently, and in several cases significantly, lower, and in addition rated the learning effectiveness lower. The agreement between manual and automatic identification of low-engaged learners based on voice recordings or face videos was further found to be adequate for future use. These experiments constitute a first step towards enabling adaptation to learners’ activity and engagement through within- and between-strategy changes of the robot’s interaction with learners.

  • 12.
    Engwall, Olov
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Speech Communication and Technology.
    Lopes, J.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Berndtson, Gustav
    KTH, School of Industrial Engineering and Management (ITM).
    Lindström, Ruben
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Ekman, Patrik
    KTH.
    Hartmanis, Eric
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Jin, Emelie
    KTH, School of Industrial Engineering and Management (ITM).
    Johnston, Ella
    KTH, School of Industrial Engineering and Management (ITM).
    Tahir, Gara
    KTH, School of Industrial Engineering and Management (ITM).
    Mekonnen, Michael
    KTH.
    Learner and teacher perspectives on robot-led L2 conversation practice (2022). In: ReCALL, ISSN 0958-3440, E-ISSN 1474-0109, Vol. 34, no 3, p. 344-359. Article in journal (Refereed).
    Abstract [en]

    This article focuses on designing and evaluating conversation practice in a second language (L2) with a robot that employs human spoken and non-verbal interaction strategies. Based on an analysis of previous work and semi-structured interviews with L2 learners and teachers, recommendations for robot-led conversation practice for adult learners at intermediate level are first defined, focused on language learning, on the social context, on the conversational structure and on verbal and visual aspects of the robot moderation. Guided by these recommendations, an experiment is set up, in which 12 pairs of L2 learners of Swedish interact with a robot in short social conversations. These robot-learner interactions are evaluated through post-session interviews with the learners, teachers' ratings of the robot's behaviour and analyses of the video-recorded conversations, resulting in a set of guidelines for robot-led conversation practice: (1) societal and personal topics increase the practice's meaningfulness for learners; (2) strategies and methods for providing corrective feedback during conversation practice need to be explored further; (3) learners should be encouraged to support each other if the robot has difficulties adapting to their linguistic level; (4) the robot should establish a social relationship by contributing with its own story, remembering the participants' input, and making use of non-verbal communication signals; and (5) improvements are required regarding naturalness and intelligibility of text-to-speech synthesis, in particular its speed, if it is to be used for conversations with L2 learners. 

  • 13.
    Engwall, Olov
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Lopes, José
    Heriot-Watt University.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Berndtson, Gustav
    KTH.
    Lindström, Ruben
    KTH.
    Ekman, Patrik
    KTH.
    Hartmanis, Eric
    KTH.
    Jin, Emelie
    KTH.
    Johnston, Ella
    KTH.
    Tahir, Gara
    KTH.
    Mekonnen, Michael
    KTH.
    Learner and teacher perspectives on robot-led L2 conversation practice. Manuscript (preprint) (Other academic).
    Abstract [en]

    This article focuses on designing and evaluating conversation practice in a second language (L2) with a robot that employs human spoken and non-verbal interaction strategies. Based on an analysis of previous work and semi-structured interviews with L2 learners and teachers, recommendations for robot-led conversation practice for adult learners at intermediate level are first defined, focused on language learning, on the social context, on the conversational structure and on verbal and visual aspects of the robot moderation. Guided by these recommendations, an experiment is set up, in which 12 pairs of L2 learners of Swedish interact with a robot in short social conversations. These robot-learner interactions are evaluated through post-session interviews with the learners, teachers’ ratings of the robot’s behaviour and analyses of the video-recorded conversations, resulting in a set of guidelines for robot-led conversation practice, in particular: 1) Societal and personal topics increase the practice’s meaningfulness for learners. 2) Strategies and methods for providing corrective feedback during conversation practice need to be explored further. 3) Learners should be encouraged to support each other if the robot has difficulties adapting to their linguistic level. 4) The robot should establish a social relationship, by contributing with its own story, remembering the participants’ input, and making use of non-verbal communication signals. 5) Improvements are required regarding naturalness and intelligibility of text-to-speech synthesis, in particular its speed, if it is to be used for conversations with L2 learners.

  • 14.
    Engwall, Olov
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH, Speech Communication and Technology.
    Águas Lopes, José David
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Is a Wizard-of-Oz Required for Robot-Led Conversation Practice in a Second Language? (2022). In: International Journal of Social Robotics, ISSN 1875-4791, E-ISSN 1875-4805. Article in journal (Refereed).
    Abstract [en]

    The large majority of previous work on human-robot conversations in a second language has been performed with a human wizard-of-Oz. The reasons are that automatic speech recognition of non-native conversational speech is considered to be unreliable and that the dialogue management task of selecting robot utterances that are adequate at a given turn is complex in social conversations. This study therefore investigates if robot-led conversation practice in a second language with pairs of adult learners could potentially be managed by an autonomous robot. We first investigate how correct and understandable transcriptions of second language learner utterances are when made by a state-of-the-art speech recogniser. We find both a relatively high word error rate (41%) and that a substantial share (42%) of the utterances are judged to be incomprehensible or only partially understandable by a human reader. We then evaluate how adequate the robot utterance selection is, when performed manually based on the speech recognition transcriptions or autonomously using (a) predefined sequences of robot utterances, (b) a general state-of-the-art language model that selects utterances based on learner input or the preceding robot utterance, or (c) a custom-made statistical method that is trained on observations of the wizard’s choices in previous conversations. It is shown that adequate or at least acceptable robot utterances are selected by the human wizard in most cases (96%), even though the ASR transcriptions have a high word error rate. Further, the custom-made statistical method performs as well as manual selection of robot utterances based on ASR transcriptions. It was also found that the interaction strategy that the robot employed, which differed regarding how much the robot maintained the initiative in the conversation and if the focus of the conversation was on the robot or the learners, had marginal effects on the word error rate and understandability of the transcriptions but larger effects on the adequateness of the utterance selection. Autonomous robot-led conversations may hence work better with some robot interaction strategies.

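    The custom-made statistical method in this abstract is described only as being trained on observations of the wizard's choices in previous conversations. A minimal sketch of that idea, assuming (purely for illustration) that robot utterances come from a fixed inventory and that selection conditions only on the preceding robot utterance:

```python
# Illustrative sketch (assumptions, not the paper's model): estimate
# P(next robot utterance | previous one) from logged wizard-of-Oz
# sessions and pick the most frequent continuation.
from collections import Counter, defaultdict

# Invented placeholder logs: one list of utterance labels per session.
wizard_logs = [
    ["greet", "ask_hobby", "follow_up_hobby", "ask_weekend", "close"],
    ["greet", "ask_hobby", "ask_weekend", "follow_up_weekend", "close"],
    ["greet", "ask_weekend", "follow_up_weekend", "ask_hobby", "close"],
]

transitions = defaultdict(Counter)
for session in wizard_logs:
    for prev, nxt in zip(session, session[1:]):
        transitions[prev][nxt] += 1

def next_utterance(previous: str) -> str:
    """Pick the continuation the wizard chose most often."""
    if previous not in transitions:
        return "fallback_open_question"  # invented fallback label
    return transitions[previous].most_common(1)[0][0]

print(next_utterance("greet"))      # -> ask_hobby (2 of 3 sessions)
print(next_utterance("ask_hobby"))  # -> follow_up_hobby (ties broken
                                    #    by insertion order)
```
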
  • 15.
    Engwall, Olov
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Águas Lopes, José David
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Berndtson, Gustav
    Lindström, Ruben
    Ekman, Patrik
    Hartmanis, Eric
    Jin, Emelie
    Johnston, Ella
    Tahir, Gara
    Mekonnen, Michael
    Learner and teacher perspectives on robot-led L2 conversation practice. Article in journal (Refereed).
    Abstract [en]

    This article focuses on designing and evaluating conversation practice in a second language (L2) with a robot that employs human spoken and non-verbal interaction strategies. Based on an analysis of previous work and semi-structured interviews with L2 learners and teachers, recommendations for robot-led conversation practice for adult learners at intermediate level are first defined, focused on language learning, on the social context, on the conversational structure and on verbal and visual aspects of the robot moderation. Guided by these recommendations, an experiment is set up, in which 12 pairs of L2 learners of Swedish interact with a robot in short social conversations. These robot-learner interactions are evaluated through post-session interviews with the learners, teachers’ ratings of the robot’s behaviour and analyses of the video-recorded conversations, resulting in a set of guidelines for robot-led conversation practice, in particular: 1) Societal and personal topics increase the practice’s meaningfulness for learners. 2) Strategies and methods for providing corrective feedback during conversation practice need to be explored further. 3) Learners should be encouraged to support each other if the robot has difficulties adapting to their linguistic level. 4) The robot should establish a social relationship, by contributing with its own story, remembering the participants’ input, and making use of non-verbal communication signals. 5) Improvements are required regarding naturalness and intelligibility of text-to-speech synthesis, in particular its speed, if it is to be used for conversations with L2 learners. 

    Download full text (pdf)
    fulltext
  • 16.
    Gillet, Sarah
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Abelho Pereira, André Tiago
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Lopes, José
    Heriot-Watt University.
    Engwall, Olov
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Leite, Iolanda
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Robot Gaze Can Mediate Participation Imbalance in Groups with Different Skill Levels (2021). In: Proceedings of the 2021 ACM/IEEE International Conference on Human-Robot Interaction, Association for Computing Machinery, 2021, p. 303-311. Conference paper (Refereed).
    Abstract [en]

    Many small group activities, like working teams or study groups, have a high dependency on the skill of each group member. Differences in skill level among participants can affect not only the performance of a team but also influence the social interaction of its members. In these circumstances, an active member could balance individual participation without exerting direct pressure on specific members by using indirect means of communication, such as gaze behaviors. Similarly, in this study, we evaluate whether a social robot can balance the level of participation in a language skill-dependent game, played by a native speaker and a second language learner. In a between-subjects study (N = 72), we compared an adaptive robot gaze behavior, targeted to increase the level of contribution of the least active player, with a non-adaptive gaze behavior. Our results imply that, while overall levels of speech participation were influenced predominantly by personal traits of the participants, the robot’s adaptive gaze behavior could shape the interaction among participants, which led to more even participation during the game.
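
    A minimal sketch of the adaptive gaze idea in this abstract: after each turn, the robot gazes at the participant with the smallest accumulated speaking time, inviting them to contribute. The function and numbers are illustrative assumptions, not the study's implementation:

```python
# Illustrative sketch of an adaptive gaze policy: look towards the
# participant who has contributed least so far (assumed policy).
def gaze_target(speaking_time: dict[str, float]) -> str:
    """Return the participant the robot should gaze at next:
    the one with the smallest accumulated speaking time."""
    return min(speaking_time, key=speaking_time.get)

session = {"native speaker": 132.5, "L2 learner": 58.0}  # invented data
print(f"Robot gazes at: {gaze_target(session)}")         # -> L2 learner
```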

  • 17.
    McMillan, Donald
    et al.
    Stockholm University, Stockholm, Sweden.
    Jaber, Razan
    University College Dublin, Dublin, Ireland.
    Cowan, Benjamin R.
    University College Dublin, Dublin, Ireland.
    Fischer, Joel E.
    University of Nottingham, Nottingham, United Kingdom.
    Irfan, Bahar
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Zargham, Nima
    Digital Media Lab, University of Bremen, Germany.
    Lee, Minha
    Eindhoven University of Technology, Eindhoven, The Netherlands.
    Human-Robot Conversational Interaction (HRCI) (2023). In: HRI 2023: Companion of the ACM/IEEE International Conference on Human-Robot Interaction, Association for Computing Machinery (ACM), 2023, p. 923-925. Conference paper (Refereed).
    Abstract [en]

    Conversation is one of the primary methods of interaction between humans and robots. It provides a natural way of communication with the robot, thereby reducing the obstacles that can be faced through other interfaces (e.g., text or touch) that may cause difficulties to certain populations, such as the elderly or those with disabilities, promoting inclusivity in Human-Robot Interaction (HRI). Work in HRI has contributed significantly to the design, understanding and evaluation of human-robot conversational interactions. Concurrently, the Conversational User Interfaces (CUI) community has developed with similar aims, though with a wider focus on conversational interactions across a range of devices and platforms. This workshop aims to bring together the CUI and HRI communities to outline key shared opportunities and challenges in developing conversational interactions with robots, resulting in collaborative publications targeted at the CUI 2023 provocations track.

  • 18.
    Weldon, Catherine F.
    et al.
    KTH.
    Gillet, Sarah
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Cumbal, Ronald
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Leite, Iolanda
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Exploring non-verbal gaze behavior in groups mediated by an adaptive robot (2021). In: ACM/IEEE International Conference on Human-Robot Interaction, IEEE Computer Society, 2021, p. 357-361. Conference paper (Refereed).
    Abstract [en]

    In this study, non-verbal behavior was observed in diversely skilled groups participating in a collaborative educational game with a humanoid robot. Research has indicated that a mediating robot gaze can equalize the verbal contributions from each differently skilled participant, promoting inclusion and learning. The experiment results were further analyzed, extending to non-verbal effects. The initial results from two experiments under different robot gaze behaviors indicate that modifications in the robot's gaze can lead to different gaze behavior in participants. It was observed that a gaze-mediating behavior in the robot led to increased gaze change frequency among participants as well as more time spent mirroring the robot's gaze. These initial results show promise in how a robot can balance attention in a collaborative learning environment.
