1 - 50 of 87
  • 1. Al Moubayed, S.
    et al.
    Bohus, D.
    Esposito, A.
    Heylen, D.
    Koutsombogera, M.
    Papageorgiou, H.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    UM3I 2014 chairs' welcome. 2014. In: UM3I 2014 - Proceedings of the 2014 ACM Workshop on Understanding and Modeling Multiparty, Multimodal Interactions, Co-located with ICMI 2014, 2014, p. iii. Conference paper (Other (popular science, debate, etc.))
  • 2.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Blomberg, Mats
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Mirning, N.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Talking with Furhat - multi-party interaction with a back-projected robot head. 2012. In: Proceedings of Fonetik 2012, Gothenburg, Sweden, 2012, pp. 109-112. Conference paper (Other academic)
    Abstract [en]

    This is a condensed presentation of some recent work on a back-projected robotic head for multi-party interaction in public settings. We will describe some of the design strategies and give some preliminary analysis of an interaction database collected at the Robotville exhibition at the London Science Museum.

  • 3.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Bollepalli, Bajibabu
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Hussen-Abdelaziz, A.
    Johansson, Martin
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Koutsombogera, M.
    Lopes, J. D.
    Novikova, J.
    Oertel, Catharine
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Stefanov, Kalin
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Varol, G.
    Human-robot Collaborative Tutoring Using Multiparty Multimodal Spoken Dialogue. 2014. Conference paper (Refereed)
    Abstract [en]

    In this paper, we describe a project that explores a novel experimental setup towards building a spoken, multi-modally rich, and human-like multiparty tutoring robot. A human-robot interaction setup is designed, and a human-human dialogue corpus is collected. The corpus targets the development of a dialogue system platform to study verbal and nonverbal tutoring strategies in multiparty spoken interactions with robots which are capable of spoken dialogue. The dialogue task is centered on two participants involved in a dialogue aiming to solve a card-ordering game. Along with the participants sits a tutor (robot) that helps the participants perform the task, and organizes and balances their interaction. Different multimodal signals captured and auto-synchronized by different audio-visual capture technologies, such as a microphone array, Kinects, and video cameras, were coupled with manual annotations. These are used to build a situated model of the interaction based on the participants' personalities, their state of attention, their conversational engagement and verbal dominance, and how that is correlated with the verbal and visual feedback, turn-management, and conversation regulatory actions generated by the tutor. Driven by the analysis of the corpus, we will also show the detailed design methodologies for an affective and multimodally rich dialogue system that allows the robot to measure incrementally the attention states and the dominance of each participant, allowing the robot head Furhat to maintain a well-coordinated, balanced, and engaging conversation that attempts to maximize the agreement and the contribution to solving the task. This project sets the first steps to explore the potential of using multimodal dialogue systems to build interactive robots that can serve in educational, team building, and collaborative task solving applications.

  • 4.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Bollepalli, Bajibabu
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Hussen-Abdelaziz, A.
    Johansson, Martin
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Koutsombogera, M.
    Lopes, J.
    Novikova, J.
    Oertel, Catharine
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Stefanov, Kalin
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Varol, G.
    Tutoring Robots: Multiparty Multimodal Social Dialogue With an Embodied Tutor. 2014. Conference paper (Refereed)
    Abstract [en]

    This project explores a novel experimental setup towards building a spoken, multi-modally rich, and human-like multiparty tutoring agent. A setup is developed and a corpus is collected that targets the development of a dialogue system platform to explore verbal and nonverbal tutoring strategies in multiparty spoken interactions with embodied agents. The dialogue task is centered on two participants involved in a dialogue aiming to solve a card-ordering game. With the participants sits a tutor that helps the participants perform the task and organizes and balances their interaction. Different multimodal signals captured and auto-synchronized by different audio-visual capture technologies were coupled with manual annotations to build a situated model of the interaction based on the participants' personalities, their temporally-changing state of attention, their conversational engagement and verbal dominance, and the way these are correlated with the verbal and visual feedback, turn-management, and conversation regulatory actions generated by the tutor. At the end of this chapter we discuss the potential areas of research and developments this work opens and some of the challenges that lie in the road ahead.

  • 5.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Mirning, Nicole
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Tscheligi, Manfred
    Furhat goes to Robotville: a large-scale multiparty human-robot interaction data collection in a public space. 2012. In: Proc of LREC Workshop on Multimodal Corpora, Istanbul, Turkey, 2012. Conference paper (Refereed)
    Abstract [en]

    In the four days of the Robotville exhibition at the London Science Museum, UK, during which the back-projected head Furhat in a situated spoken dialogue system was seen by almost 8 000 visitors, we collected a database of 10 000 utterances spoken to Furhat in situated interaction. The data collection is an example of a particular kind of corpus collection of human-machine dialogues in public spaces that has several interesting and specific characteristics, both with respect to the technical details of the collection and with respect to the resulting corpus contents. In this paper, we take the Furhat data collection as a starting point for a discussion of the motives for this type of data collection, its technical peculiarities and prerequisites, and the characteristics of the resulting corpus.

  • 6.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Spontaneous spoken dialogues with the Furhat human-like robot head. 2014. In: HRI '14 Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction, Bielefeld, Germany, 2014, p. 326. Conference paper (Refereed)
    Abstract [en]

    We will show in this demonstrator an advanced multimodal and multiparty spoken conversational system using Furhat, a robot head based on projected facial animation. Furhat is an anthropomorphic robot head that utilizes facial animation for physical robot heads using back-projection. In the system, multimodality is enabled using speech and rich visual input signals such as multi-person real-time face tracking and microphone tracking. The demonstrator will showcase a system that is able to carry out social dialogue with multiple interlocutors simultaneously with rich output signals such as eye and head coordination, lip-synchronized speech synthesis, and non-verbal facial gestures used to regulate fluent and expressive multiparty conversations. The dialogue design is performed using the IrisTK [4] dialogue authoring toolkit developed at KTH. The system will also be able to perform as a moderator in a quiz-game, showing different strategies for regulating spoken situated interactions.

  • 7.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    The Furhat Social Companion Talking Head. 2013. In: Interspeech 2013 - Show and Tell, 2013, pp. 747-749. Conference paper (Refereed)
    Abstract [en]

    In this demonstrator we present the Furhat robot head. Furhat is a highly human-like robot head in terms of dynamics, thanks to its use of back-projected facial animation. Furhat also takes advantage of a complex and advanced dialogue toolkit designed to facilitate rich and fluent multimodal multiparty human-machine situated and spoken dialogue. The demonstrator will present a social dialogue system with Furhat that allows for several simultaneous interlocutors, and takes advantage of several verbal and nonverbal input signals such as speech input, real-time multi-face tracking, and facial analysis, and communicates with its users in a mixed-initiative dialogue, using state-of-the-art speech synthesis, with rich prosody, lip-animated facial synthesis, eye and head movements, and gestures.

  • 8.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Furhat: A Back-projected Human-like Robot Head for Multiparty Human-Machine Interaction. 2012. In: Cognitive Behavioural Systems: COST 2102 International Training School, Dresden, Germany, February 21-26, 2011, Revised Selected Papers / [ed] Anna Esposito, Antonietta M. Esposito, Alessandro Vinciarelli, Rüdiger Hoffmann, Vincent C. Müller, Springer Berlin/Heidelberg, 2012, pp. 114-130. Conference paper (Refereed)
    Abstract [en]

    In this chapter, we first present a summary of findings from two previous studies on the limitations of using flat displays with embodied conversational agents (ECAs) in the context of face-to-face human-agent interaction. We then motivate the need for a three dimensional display of faces to guarantee accurate delivery of gaze and directional movements and present Furhat, a novel, simple, highly effective, and human-like back-projected robot head that utilizes computer animation to deliver facial movements, and is equipped with a pan-tilt neck. After presenting a detailed summary on why and how Furhat was built, we discuss the advantages of using optically projected animated agents for interaction. We discuss using such agents in terms of situatedness, environment, context awareness, and social, human-like face-to-face interaction with robots where subtle nonverbal and social facial signals can be communicated. At the end of the chapter, we present a recent application of Furhat as a multimodal multiparty interaction system that was presented at the London Science Museum as part of a robot festival. We conclude the paper by discussing future developments, applications and opportunities of this technology.

  • 9.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Heylen, D.
    Bohus, D.
    Koutsombogera, Maria
    Papageorgiou, H.
    Esposito, A.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    UM3I 2014: International workshop on understanding and modeling multiparty, multimodal interactions. 2014. In: ICMI 2014 - Proceedings of the 2014 International Conference on Multimodal Interaction, Association for Computing Machinery (ACM), 2014, pp. 537-538. Conference paper (Refereed)
    Abstract [en]

    In this paper, we present a brief summary of the international workshop on Modeling Multiparty, Multimodal Interactions. The UM3I 2014 workshop is held in conjunction with the ICMI 2014 conference. The workshop will highlight recent developments and adopted methodologies in the analysis and modeling of multiparty and multimodal interactions, the design and implementation principles of related human-machine interfaces, as well as the identification of potential limitations and ways of overcoming them.

  • 10.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Effects of 2D and 3D Displays on Turn-taking Behavior in Multiparty Human-Computer Dialog. 2011. In: SemDial 2011: Proceedings of the 15th Workshop on the Semantics and Pragmatics of Dialogue / [ed] Ron Artstein, Mark Core, David DeVault, Kallirroi Georgila, Elsi Kaiser, Amanda Stent, Los Angeles, CA, 2011, pp. 192-193. Conference paper (Refereed)
    Abstract [en]

    The perception of gaze from an animated agent on a 2D display has been shown to suffer from the Mona Lisa effect, which means that exclusive mutual gaze cannot be established if there is more than one observer. In this study, we investigate this effect when it comes to turn-taking control in a multi-party human-computer dialog setting, where a 2D display is compared to a 3D projection. The results show that the 2D setting results in longer response times and lower turn-taking accuracy.

  • 11.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Perception of Gaze Direction for Situated Interaction. 2012. In: Proceedings of the 4th Workshop on Eye Gaze in Intelligent Human Machine Interaction, Gaze-In 2012, ACM, 2012. Conference paper (Refereed)
    Abstract [en]

    Accurate human perception of robots' gaze direction is crucial for the design of a natural and fluent situated multimodal face-to-face interaction between humans and machines. In this paper, we present an experiment targeted at quantifying the effects of different gaze cues synthesized using the Furhat back-projected robot head, on the accuracy of perceived spatial direction of gaze by humans using 18 test subjects. The study first quantifies the accuracy of the perceived gaze direction in a human-human setup, and compares that to the use of synthesized gaze movements in different conditions: viewing the robot eyes frontal or at a 45 degrees angle side view. We also study the effect of 3D gaze by controlling both eyes to indicate the depth of the focal point (vergence), the use of gaze or head pose, and the use of static or dynamic eyelids. The findings of the study are highly relevant to the design and control of robots and animated agents in situated face-to-face interaction.
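
    The accuracy measure at the heart of this experiment can be made concrete: the error between an intended and a perceived gaze direction is the angle between two direction vectors. The sketch below is illustrative only (plain Python, hypothetical numbers), not the authors' analysis code.

        import math

        def angular_error_deg(intended, perceived):
            """Angle in degrees between two 3D gaze direction vectors."""
            dot = sum(a * b for a, b in zip(intended, perceived))
            norm = (math.sqrt(sum(a * a for a in intended)) *
                    math.sqrt(sum(b * b for b in perceived)))
            # Clamp to [-1, 1] before acos for numerical safety.
            return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

        # Hypothetical trial: the robot intends a target 30 degrees to its left,
        # while the subject's judgement corresponds to roughly 25 degrees.
        intended = (math.sin(math.radians(30)), 0.0, math.cos(math.radians(30)))
        perceived = (math.sin(math.radians(25)), 0.0, math.cos(math.radians(25)))
        print(f"perceived-gaze error: {angular_error_deg(intended, perceived):.1f} degrees")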

  • 12.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Turn-taking Control Using Gaze in Multiparty Human-Computer Dialogue: Effects of 2D and 3D Displays. 2011. In: Proceedings of the International Conference on Audio-Visual Speech Processing 2011, Stockholm: KTH Royal Institute of Technology, 2011, pp. 99-102. Conference paper (Refereed)
    Abstract [en]

    In a previous experiment we found that the perception of gaze from an animated agent on a two-dimensional display suffers from the Mona Lisa effect, which means that exclusive mutual gaze cannot be established if there is more than one observer. By using a three-dimensional projection surface, this effect can be eliminated. In this study, we investigate whether this difference also holds for the turn-taking behaviour of subjects interacting with the animated agent in a multi-party dialogue. We present a Wizard-of-Oz experiment where five subjects talk to an animated agent in a route direction dialogue. The results show that the subjects to some extent can infer the intended target of the agent’s questions, in spite of the Mona Lisa effect, but that the accuracy of gaze when it comes to selecting an addressee is still significantly lower in the 2D condition, as compared to the 3D condition. The response time is also significantly longer in the 2D condition, indicating that the inference of intended gaze may require additional cognitive efforts.

  • 13.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Lip-reading: Furhat audio visual intelligibility of a back projected animated face. 2012. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer Berlin/Heidelberg, 2012, pp. 196-203. Conference paper (Refereed)
    Abstract [en]

    Back projecting a computer animated face, onto a three dimensional static physical model of a face, is a promising technology that is gaining ground as a solution to building situated, flexible and human-like robot heads. In this paper, we first briefly describe Furhat, a back projected robot head built for the purpose of multimodal multiparty human-machine interaction, and its benefits over virtual characters and robotic heads; and then motivate the need to investigate the contribution to speech intelligibility that Furhat's face offers. We present an audio-visual speech intelligibility experiment, in which 10 subjects listened to short sentences with a degraded speech signal. The experiment compares the gain in intelligibility between lip reading a face visualized on a 2D screen and a 3D back-projected face, from different viewing angles. The results show that the audio-visual speech intelligibility holds when the avatar is projected onto a static face model (in the case of Furhat), and even, rather surprisingly, exceeds it. This means that despite the movement limitations that back projected animated face models bring about, their audio visual speech intelligibility is equal, or even higher, compared to the same models shown on flat displays. At the end of the paper we discuss several hypotheses on how to interpret the results, and motivate future investigations to better explore the characteristics of visual speech perception of 3D projected faces.

  • 14.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    The Furhat Back-Projected Humanoid Head - Lip Reading, Gaze And Multi-Party Interaction. 2013. In: International Journal of Humanoid Robotics, ISSN 0219-8436, Vol. 10, no. 1, p. 1350005. Journal article (Refereed)
    Abstract [en]

    In this paper, we present Furhat - a back-projected human-like robot head using state-of-the-art facial animation. Three experiments are presented where we investigate how the head might facilitate human-robot face-to-face interaction. First, we investigate how the animated lips increase the intelligibility of the spoken output, and compare this to an animated agent presented on a flat screen, as well as to a human face. Second, we investigate the accuracy of the perception of Furhat's gaze in a setting typical for situated interaction, where Furhat and a human are sitting around a table. The accuracy of the perception of Furhat's gaze is measured depending on eye design, head movement and viewing angle. Third, we investigate the turn-taking accuracy of Furhat in a multi-party interactive setting, as compared to an animated agent on a flat screen. We conclude with some observations from a public setting at a museum, where Furhat interacted with thousands of visitors in a multi-party interaction.

  • 15.
    Al Moubayed, Samer
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Stefanov, Kalin
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Multimodal Multiparty Social Interaction with the Furhat Head. 2012. Conference paper (Refereed)
    Abstract [en]

    We will show in this demonstrator an advanced multimodal and multiparty spoken conversational system using Furhat, a robot head based on projected facial animation. Furhat is a human-like interface that utilizes facial animation for physical robot heads using back-projection. In the system, multimodality is enabled using speech and rich visual input signals such as multi-person real-time face tracking and microphone tracking. The demonstrator will showcase a system that is able to carry out social dialogue with multiple interlocutors simultaneously with rich output signals such as eye and head coordination, lips synchronized speech synthesis, and non-verbal facial gestures used to regulate fluent and expressive multiparty conversations.

  • 16.
    Beskow, Jonas
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Carlson, Rolf
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Edlund, Jens
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Heldner, Mattias
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Hjalmarsson, Anna
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Multimodal Interaction Control. 2009. In: Computers in the Human Interaction Loop / [ed] Waibel, Alexander; Stiefelhagen, Rainer, Berlin/Heidelberg: Springer Berlin/Heidelberg, 2009, pp. 143-158. Book chapter (Refereed)
  • 17.
    Beskow, Jonas
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT.
    Edlund, Jens
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT.
    Jonsson, Oskar
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Speech Technology, CTT.
    Speech technology in the European project MonAMI. 2008. In: Proceedings of FONETIK 2008 / [ed] Anders Eriksson, Jonas Lindh, Gothenburg, Sweden: University of Gothenburg, 2008, pp. 33-36. Conference paper (Other academic)
    Abstract [en]

    This paper describes the role of speech and speech technology in the European project MonAMI, which aims at “mainstreaming accessibility in consumer goods and services, using advanced technologies to ensure equal access, independent living and participation for all”. It presents the Reminder, a prototype embodied conversational agent (ECA) which helps users to plan activities and to remember what to do. The prototype merges speech technology with other, existing technologies: Google Calendar and a digital pen and paper. The solution allows users to continue using a paper calendar in the manner they are used to, whilst the ECA provides notifications on what has been written in the calendar. Users may also ask questions such as “When was I supposed to meet Sara?” or “What’s on my schedule today?”

  • 18.
    Beskow, Jonas
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Edlund, Jens
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Innovative interfaces in MonAMI: The Reminder. 2008. In: Perception In Multimodal Dialogue Systems, Proceedings / [ed] Andre, E; Dybkjaer, L; Minker, W; Neumann, H; Pieraccini, R; Weber, M, 2008, Vol. 5078, pp. 272-275. Conference paper (Refereed)
    Abstract [en]

    This demo paper presents the first version of the Reminder, a prototype ECA developed in the European project MonAMI, which aims at "mainstreaming accessibility in consumer goods and services, using advanced technologies to ensure equal access, independent living and participation for all". The Reminder helps users to plan activities and to remember what to do. The prototype merges ECA technology with other, existing technologies: Google Calendar and a digital pen and paper. This innovative combination of modalities allows users to continue using a paper calendar in the manner they are used to, whilst the ECA provides verbal notifications on what has been written in the calendar. Users may also ask questions such as "When was I supposed to meet Sara?" or "What's on my schedule today?"

  • 19.
    Beskow, Jonas
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Edlund, Jens
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Tobiasson, Helena
    KTH, School of Computer Science and Communication (CSC), Human-Computer Interaction, MDI (closed 2011-12-31).
    The MonAMI Reminder: a spoken dialogue system for face-to-face interaction. 2009. In: Proceedings of the 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009, Brighton, U.K., 2009, pp. 300-303. Conference paper (Refereed)
    Abstract [en]

    We describe the MonAMI Reminder, a multimodal spoken dialogue system which can assist elderly and disabled people in organising and initiating their daily activities. Based on deep interviews with potential users, we have designed a calendar and reminder application which uses an innovative mix of an embodied conversational agent, digital pen and paper, and the web to meet the needs of those users as well as the current constraints of speech technology. We also explore the use of head pose tracking for interaction and attention control in human-computer face-to-face interaction.

  • 20.
    Blomberg, Mats
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Al Moubayed, Samer
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Children and adults in dialogue with the robot head Furhat - corpus collection and initial analysis. 2012. In: Proceedings of WOCCI, Portland, OR, 2012. Conference paper (Refereed)
  • 21.
    Carlson, Rolf
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Edlund, Jens
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Heldner, Mattias
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Hjalmarsson, Anna
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    House, David
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Towards human-like behaviour in spoken dialog systems. 2006. In: Proceedings of Swedish Language Technology Conference (SLTC 2006), Gothenburg, Sweden, 2006. Conference paper (Other academic)
    Abstract [en]

    We and others have found it fruitful to assume that users, when interacting with spoken dialogue systems, perceive the systems and their actions metaphorically. Common metaphors include the human metaphor and the interface metaphor (cf. Edlund, Heldner, & Gustafson, 2006). In the interface metaphor, the spoken dialogue system is perceived as a machine interface – often but not always a computer interface. Speech is used to accomplish what would have otherwise been accomplished by some other means of input, such as a keyboard or a mouse. In the human metaphor, on the other hand, the computer is perceived as a creature (or even a person) with humanlike conversational abilities, and speech is not a substitute or one of many alternatives, but rather the primary means of communicating with this creature. We are aware that more “natural” or human-like behaviour does not automatically make a spoken dialogue system “better” (i.e. more efficient or more well-liked by its users). Indeed, we are quite convinced that the advantage (or disadvantage) of humanlike behaviour will be highly dependent on the application. However, a dialogue system that is coherent with a human metaphor may profit from a number of characteristics.

  • 22. Cuayahuitl, Heriberto
    et al.
    Komatani, Kazunori
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Introduction for Speech and language for interactive robots. 2015. In: Computer speech & language (Print), ISSN 0885-2308, E-ISSN 1095-8363, Vol. 34, no. 1, pp. 83-86. Journal article (Refereed)
    Abstract [en]

    This special issue includes research articles which apply spoken language processing to robots that interact with human users through speech, possibly combined with other modalities. Robots that can listen to human speech, understand it, interact according to the conveyed meaning, and respond represent major research and technological challenges. Their common aim is to equip robots with natural interaction abilities. However, robotics and spoken language processing are areas that are typically studied within their respective communities with limited communication across disciplinary boundaries. The articles in this special issue represent examples that address the need for an increased multidisciplinary exchange of ideas.

  • 23.
    Edlund, Jens
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    House, David
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Prosodic Features in the Perception of Clarification Ellipses. 2005. In: Proceedings of Fonetik 2005: The XVIIIth Swedish Phonetics Conference, Gothenburg, Sweden, 2005, pp. 107-110. Conference paper (Other academic)
    Abstract [en]

    We present an experiment where subjects were asked to listen to Swedish human-computer dialogue fragments where a synthetic voice makes an elliptical clarification after a user turn. The prosodic features of the synthetic voice were systematically varied, and subjects were asked to judge the computer's actual intention. The results show that an early low F0 peak signals acceptance, that a late high peak is perceived as a request for clarification of what was said, and that a mid high peak is perceived as a request for clarification of the meaning of what was said. The study can be seen as the beginnings of a tentative model for intonation of clarification ellipses in Swedish, which can be implemented and tested in spoken dialogue systems.

  • 24.
    Edlund, Jens
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    House, David
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    The effects of prosodic features on the interpretation of clarification ellipses. 2005. In: Proceedings of Interspeech 2005: Eurospeech, 2005, pp. 2389-2392. Conference paper (Refereed)
    Abstract [en]

    In this paper, the effects of prosodic features on the interpretation of elliptical clarification requests in dialogue are studied. An experiment is presented where subjects were asked to listen to short human-computer dialogue fragments in Swedish, where a synthetic voice was making an elliptical clarification after a user turn. The prosodic features of the synthetic voice were systematically varied, and the subjects were asked to judge what was actually intended by the computer. The results show that an early low F0 peak signals acceptance, that a late high peak is perceived as a request for clarification of what was said, and that a mid high peak is perceived as a request for clarification of the meaning of what was said. The study can be seen as the beginnings of a tentative model for intonation of clarification ellipses in Swedish, which can be implemented and tested in spoken dialogue systems.
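
    The mapping reported in these two studies (early low F0 peak = acceptance; mid high peak = clarification of meaning; late high peak = clarification of what was said) lends itself to a simple rule-based reading. The following is a hedged sketch of such rules; the thresholds are invented for illustration and are not from the papers.

        def interpret_clarification(peak_position, peak_height):
            """Classify an elliptical clarification from its F0 peak.

            peak_position: relative position of the F0 peak in the utterance (0..1)
            peak_height: peak height in semitones above the speaker's baseline
            Thresholds below are illustrative, not fitted to the papers' data.
            """
            if peak_height < 4.0 and peak_position < 0.33:
                return "acceptance"
            if peak_height >= 4.0 and peak_position > 0.66:
                return "clarify what was said"
            if peak_height >= 4.0:
                return "clarify the meaning of what was said"
            return "unclear"

        print(interpret_clarification(0.2, 2.0))  # acceptance
        print(interpret_clarification(0.5, 6.0))  # clarify the meaning of what was said
        print(interpret_clarification(0.8, 6.0))  # clarify what was said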

  • 25.
    Edlund, Jens
    et al.
    KTH, Former Departments, Speech, Music and Hearing.
    Skantze, Gabriel
    KTH, Former Departments, Speech, Music and Hearing.
    Carlson, Rolf
    KTH, Former Departments, Speech, Music and Hearing.
    Higgins: a spoken dialogue system for investigating error handling techniques. 2004. In: Proceedings of the International Conference on Spoken Language Processing, ICSLP 04, 2004, pp. 229-231. Conference paper (Refereed)
    Abstract [en]

    In this paper, an overview of the Higgins project and the research within the project is presented. The project incorporates studies of error handling for spoken dialogue systems on several levels, from processing to dialogue level. A domain in which a range of different error types can be studied has been chosen: pedestrian navigation and guiding. Several data collections within Higgins have been analysed along with data from Higgins' predecessor, the AdApt system. The error handling research issues in the project are presented in light of these analyses.

  • 26. Georgiladakis, S.
    et al.
    Athanasopoulou, G.
    Meena, Raveesh
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. KTH, Speech Communication.
    David Lopes, José
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Chorianopoulou, A.
    Palogiannidi, E.
    Iosif, E.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. KTH, Speech Communication.
    Potamianos, A.
    Root Cause Analysis of Miscommunication Hotspots in Spoken Dialogue Systems. 2016. In: Interspeech 2016, San Francisco, CA, USA: International Speech Communication Association, 2016. Conference paper (Refereed)
    Abstract [en]

    A major challenge in Spoken Dialogue Systems (SDS) is the detection of problematic communication (hotspots), as well as the classification of these hotspots into different types (root cause analysis). In this work, we focus on two classes of root cause, namely, erroneous speech recognition vs. other (e.g., dialogue strategy). Specifically, we propose an automatic algorithm for detecting hotspots and classifying root causes in two subsequent steps. Regarding hotspot detection, various lexico-semantic features are used for capturing repetition patterns along with affective features. Lexico-semantic and repetition features are also employed for root cause analysis. Both algorithms are evaluated with respect to the Let’s Go dataset (bus information system). In terms of classification unweighted average recall, performance of 80% and 70% is achieved for hotspot detection and root cause analysis, respectively.
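
    The evaluation metric quoted above, unweighted average recall (UAR), is the mean of per-class recalls, so a majority-class guesser gains nothing. A self-contained sketch with made-up labels (not the paper's data):

        from collections import defaultdict

        def unweighted_average_recall(y_true, y_pred):
            """Mean of per-class recalls (UAR)."""
            correct, total = defaultdict(int), defaultdict(int)
            for t, p in zip(y_true, y_pred):
                total[t] += 1
                if t == p:
                    correct[t] += 1
            recalls = [correct[c] / total[c] for c in total]
            return sum(recalls) / len(recalls)

        # Hypothetical root-cause labels: ASR error vs. other (e.g. dialogue strategy).
        y_true = ["asr", "asr", "asr", "other", "other", "asr", "other", "other"]
        y_pred = ["asr", "other", "asr", "other", "other", "asr", "asr", "other"]
        print(f"UAR: {unweighted_average_recall(y_true, y_pred):.2f}")  # 0.75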

  • 27. Johansson, M.
    et al.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Understanding route directions in human-robot dialogue. 2011. In: Proceedings of SemDial, Los Angeles, CA, 2011, pp. 19-27. Conference paper (Refereed)
  • 28.
    Johansson, Martin
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Hori, Tatsuro
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Hothker, Anja
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Making Turn-Taking Decisions for an Active Listening Robot for Memory Training. 2016. In: SOCIAL ROBOTICS (ICSR 2016), Springer, 2016, pp. 940-949. Conference paper (Refereed)
    Abstract [en]

    In this paper we present a dialogue system and response model that allows a robot to act as an active listener, encouraging users to tell the robot about their travel memories. The response model makes a combined decision about when to respond and what type of response to give, in order to elicit more elaborate descriptions from the user and avoid non-sequitur responses. The model was trained on human-robot dialogue data collected in a Wizard-of-Oz setting, and evaluated in a fully autonomous version of the same dialogue system. Compared to a baseline system, users perceived the dialogue system with the trained model to be a significantly better listener. The trained model also resulted in dialogues with significantly fewer mistakes, a larger proportion of user speech and fewer interruptions.
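
    The combined decision described above (whether to respond at a pause, and with what type of response) can be pictured as a single function over pause and prosody features. A toy rendition follows; the features, thresholds and response types are invented stand-ins for the trained model in the paper.

        from dataclasses import dataclass

        @dataclass
        class PauseFeatures:
            pause_ms: float     # silence since the user stopped speaking
            pitch_slope: float  # final F0 slope; falling (< 0) suggests completion
            new_content: bool   # did the last user segment add new information?

        def respond(f: PauseFeatures) -> str:
            """Return 'wait', a backchannel, or an elaboration question."""
            if f.pause_ms < 500 or f.pitch_slope > 0:
                return "wait"                       # likely a hold: keep listening
            if f.new_content:
                return "backchannel: Mm, tell me more."
            return "question: What happened next?"  # elicit more elaboration

        print(respond(PauseFeatures(300, 0.5, True)))     # wait
        print(respond(PauseFeatures(900, -0.4, True)))    # backchannel
        print(respond(PauseFeatures(1200, -0.2, False)))  # question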

  • 29. Johansson, Martin
    et al.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Opportunities and obligations to take turns in collaborative multi-party human-robot interaction. 2015. In: SIGDIAL 2015 - 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference, Association for Computational Linguistics, 2015, pp. 305-314. Conference paper (Refereed)
    Abstract [en]

    In this paper we present a data-driven model for detecting opportunities and obligations for a robot to take turns in multi-party discussions about objects. The data used for the model was collected in a public setting, where the robot head Furhat played a collaborative card sorting game together with two users. The model makes a combined detection of addressee and turn-yielding cues, using multi-modal data from voice activity, syntax, prosody, head pose, movement of cards, and dialogue context. The best result for a binary decision is achieved when several modalities are combined, giving a weighted F1 score of 0.876 on data from a previously unseen interaction, using only automatically extractable features.
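
    The headline figure above is a weighted F1 over a binary take-the-turn decision from fused multimodal features. For reference, the metric itself on hypothetical predictions (scikit-learn's f1_score with average='weighted' computes the same quantity):

        def weighted_f1(y_true, y_pred):
            """Per-class F1, averaged with class-frequency weights."""
            score, n = 0.0, len(y_true)
            for c in sorted(set(y_true)):
                tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
                fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
                fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
                prec = tp / (tp + fp) if tp + fp else 0.0
                rec = tp / (tp + fn) if tp + fn else 0.0
                f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
                score += (y_true.count(c) / n) * f1
            return score

        # 1 = the robot should take the turn, 0 = it should not (made-up data).
        y_true = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
        y_pred = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]
        print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")  # 0.800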

  • 30.
    Johansson, Martin
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Comparison of human-human and human-robot Turn-taking Behaviour in multi-party Situated interaction. 2014. In: UM3I '14: Proceedings of the 2014 workshop on Understanding and Modeling Multiparty, Multimodal Interactions, Istanbul, Turkey, 2014, pp. 21-26. Conference paper (Refereed)
    Abstract [en]

    In this paper, we present an experiment where two human subjects are given a team-building task to solve together with a robot. The setting requires that the speakers' attention is partly directed towards objects on the table between them, as well as to each other, in order to coordinate turn-taking. The symmetrical setup allows us to compare human-human and human-robot turn-taking behaviour in the same interactional setting. The analysis centres around the interlocutors' attention (as measured by head pose) and gap length between turns, depending on the pragmatic function of the utterances.

  • 31.
    Johansson, Martin
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Head Pose Patterns in Multiparty Human-Robot Team-Building Interactions. 2013. In: Social Robotics: 5th International Conference, ICSR 2013, Bristol, UK, October 27-29, 2013, Proceedings / [ed] Guido Herrmann, Martin J. Pearson, Alexander Lenz, Paul Bremner, Adam Spiers, Ute Leonards, Springer, 2013, pp. 351-360. Conference paper (Refereed)
    Abstract [en]

    We present a data collection setup for exploring turn-taking in three-party human-robot interaction involving objects competing for attention. The collected corpus comprises 78 minutes in four interactions. Using automated techniques to record head pose and speech patterns, we analyze head pose patterns in turn-transitions. We find that introduction of objects makes addressee identification based on head pose more challenging. The symmetrical setup also allows us to compare human-human to human-robot behavior within the same interaction. We argue that this symmetry can be used to assess to what extent the system exhibits a human-like behavior.
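
    Addressee identification from head pose, which the abstract notes becomes harder once objects enter the scene, reduces in its simplest form to a nearest-bearing test. A hedged sketch with an invented geometry and threshold:

        def addressee(head_yaw_deg, targets, max_offset_deg=15.0):
            """Pick the target whose bearing best matches the head yaw.

            targets: mapping name -> bearing in degrees relative to the speaker.
            Returns None when nothing is close enough (e.g. gaze on the objects).
            """
            name, bearing = min(targets.items(),
                                key=lambda kv: abs(kv[1] - head_yaw_deg))
            return name if abs(bearing - head_yaw_deg) <= max_offset_deg else None

        # Hypothetical three-party layout: robot ahead, second human to the right,
        # shared objects on the table to the left.
        targets = {"robot": 0.0, "human": 40.0, "table": -30.0}
        print(addressee(5.0, targets))    # robot
        print(addressee(35.0, targets))   # human
        print(addressee(-60.0, targets))  # None: no target within 15 degrees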

  • 32.
    Johnson-Roberson, Matthew
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Bohg, Jeannette
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Kragic, Danica
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Carlson, Rolf
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Enhanced visual scene understanding through human-robot dialog. 2010. In: Dialog with Robots: AAAI 2010 Fall Symposium, 2010, pp. -144. Conference paper (Refereed)
  • 33.
    Johnson-Roberson, Matthew
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Bohg, Jeannette
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Carlson, Rolf
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Kragic, Danica
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Rasolzadeh, Babak
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Enhanced Visual Scene Understanding through Human-Robot Dialog. 2011. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, 2011, pp. 3342-3348. Conference paper (Refereed)
    Abstract [en]

    We propose a novel human-robot-interaction framework for robust visual scene understanding. Without any a-priori knowledge about the objects, the task of the robot is to correctly enumerate how many of them are in the scene and segment them from the background. Our approach builds on top of state-of-the-art computer vision methods, generating object hypotheses through segmentation. This process is combined with a natural dialog system, thus including a ‘human in the loop’ where, by exploiting the natural conversation of an advanced dialog system, the robot gains knowledge about ambiguous situations. We present an entropy-based system allowing the robot to detect the poorest object hypotheses and query the user for arbitration. Based on the information obtained from the human-robot dialog, the scene segmentation can be re-seeded and thereby improved. We present experimental results on real data that show an improved segmentation performance compared to segmentation without interaction.
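
    The entropy-based arbitration described above can be reduced to a few lines: compute the Shannon entropy of each object hypothesis' label distribution and ask the user about the most uncertain one. The distributions below are invented for illustration.

        import math

        def entropy(probs):
            """Shannon entropy (bits) of a discrete distribution."""
            return -sum(p * math.log2(p) for p in probs if p > 0)

        # Hypothetical per-hypothesis distributions over interpretations,
        # e.g. (one object, two merged objects, background).
        hypotheses = {
            "region A": [0.90, 0.05, 0.05],  # confident
            "region B": [0.40, 0.35, 0.25],  # ambiguous: worth a question
            "region C": [0.70, 0.20, 0.10],
        }
        query = max(hypotheses, key=lambda h: entropy(hypotheses[h]))
        print(f"query the user about {query} "
              f"(H = {entropy(hypotheses[query]):.2f} bits)")  # region B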

  • 34.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Avramova, Vanya
    KTH.
    Alexanderson, Simon
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Jonell, Patrik
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Oertel, Catharine
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH. Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland.
    Beskow, Jonas
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. 2018. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, 2018, pp. 119-127. Conference paper (Refereed)
    Abstract [en]

    In this paper we present a corpus of multiparty situated interaction where participants collaborated on moving virtual objects on a large touch screen. A moderator facilitated the discussion and directed the interaction. The corpus contains recordings of a variety of multimodal data, in that we captured speech, eye gaze and gesture data using a multisensory setup (wearable eye trackers, motion capture and audio/video). Furthermore, in the description of the multimodal corpus, we investigate four different types of social gaze: referential gaze, joint attention, mutual gaze and gaze aversion by both perspectives of a speaker and a listener. We annotated the groups’ object references during object manipulation tasks and analysed the group’s proportional referential eye-gaze with regards to the referent object. When investigating the distributions of gaze during and before referring expressions we could corroborate the differences in time between speakers’ and listeners’ eye gaze found in earlier studies. This corpus is of particular interest to researchers who are interested in social eye-gaze patterns in turn-taking and referring language in situated multi-party interaction.
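
    The "proportional referential eye-gaze" analysis mentioned above boils down to: within a window around a referring expression, what fraction of gaze samples land on the referent? A sketch over hypothetical eye-tracker annotations:

        def gaze_proportion(samples, referent, start, end):
            """Fraction of gaze samples in [start, end) that are on `referent`.

            samples: list of (timestamp_seconds, gazed_object) pairs.
            """
            window = [obj for t, obj in samples if start <= t < end]
            return window.count(referent) / len(window) if window else 0.0

        # Hypothetical samples around the utterance "move the red box".
        samples = [(0.0, "blue box"), (0.2, "red box"), (0.4, "red box"),
                   (0.6, "listener"), (0.8, "red box"), (1.0, "red box")]
        print(f"gaze on referent: {gaze_proportion(samples, 'red box', 0.0, 1.2):.0%}")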

  • 35.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Sibirtseva, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, Perception and Learning, RPL.
    Pereira, André
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Multimodal reference resolution in collaborative assembly tasks. 2018. In: Multimodal reference resolution in collaborative assembly tasks, ACM Digital Library, 2018. Conference paper (Refereed)
  • 36.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Abelho Pereira, André Tiago
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    The Effects of Embodiment and Social Eye-Gaze in Conversational Agents. 2019. In: Proceedings of the 41st Annual Conference of the Cognitive Science Society (CogSci), 2019. Conference paper (Refereed)
    Abstract [en]

    The adoption of conversational agents is growing at a rapid pace. Agents however, are not optimised to simulate key social aspects of situated human conversational environments. Humans are intellectually biased towards social activity when facing more anthropomorphic agents or when presented with subtle social cues. In this work, we explore the effects of simulating anthropomorphism and social eye-gaze in three conversational agents. We tested whether subjects’ visual attention would be similar to agents in different forms of embodiment and social eye-gaze. In a within-subject situated interaction study (N=30), we asked subjects to engage in task-oriented dialogue with a smart speaker and two variations of a social robot. We observed shifting of interactive behaviour by human users, as shown in differences in behavioural and objective measures. With a trade-off in task performance, social facilitation is higher with more anthropomorphic social agents when performing the same task.

  • 37.
    Li, Chengjie
    et al.
    KTH.
    Androulakaki, Theofronia
    KTH.
    Gao, Alex Yuan
    Yang, Fangkai
    KTH, School of Electrical Engineering and Computer Science (EECS), Computational Science and Technology (CST).
    Saikia, Himangshu
    KTH, School of Electrical Engineering and Computer Science (EECS), Computational Science and Technology (CST).
    Peters, Christopher
    KTH, School of Electrical Engineering and Computer Science (EECS), Computational Science and Technology (CST).
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Effects of Posture and Embodiment on Social Distance in Human-Agent Interaction in Mixed Reality, 2018. In: Proceedings of the 18th International Conference on Intelligent Virtual Agents, ACM Digital Library, 2018, pp. 191-196. Conference paper (Refereed)
    Abstract [en]

    Mixed reality offers new potential for social interaction experiences with virtual agents. In addition, it can be used to experiment with the design of physical robots. However, while previous studies have investigated comfortable social distances between humans and artificial agents in real and virtual environments, there is little data on mixed reality environments. We therefore conducted an experiment in which participants were asked to walk up to an agent and ask a question, in order to investigate the social distances maintained, as well as the subjects' experience of the interaction. We manipulated both the embodiment of the agent (robot vs. human and virtual vs. physical) and the agent's posture (closed vs. open). The virtual agent was displayed using a mixed reality headset. Our experiment involved 35 participants in a within-subject design. We show that, in the context of social interactions, mixed reality fares well against physical environments, and robots fare well against humans, barring a few technical challenges.

  • 38.
    Lopes, José
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Engwall, Olov
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    A First Visit to the Robot Language Café, 2017. In: Proceedings of the ISCA workshop on Speech and Language Technology in Education [ed. Engwall & Lopes], Stockholm, 2017. Conference paper (Refereed)
    Abstract [en]

    We present an exploratory study on using a social robot in a conversational setting to practice a second language. The practice is carried out within a so-called language café, where two second-language learners and one native moderator, a human or a robot, engage in social small talk. We compare the interactions with the human and robot moderators and perform a qualitative analysis of the potential of a social robot as a conversational partner for language learning. Interactions with the robot are carried out in a Wizard-of-Oz setting, in which the robot is controlled by the same native moderator who leads the corresponding human-moderator session. Observations of the video-recorded sessions and the subject questionnaires suggest that the appropriate learner level for this practice is elementary (A1 to A2), for whom the structured, slightly repetitive interaction pattern was perceived as beneficial. We identify both key features that are appreciated by the learners and technological components that need further development.

  • 39.
    Lopes, José
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Salvi, Giampiero
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Abad, A.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Batista, F.
    Meena, Raveesh
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Trancoso, I.
    Detecting Repetitions in Spoken Dialogue Systems Using Phonetic Distances, 2015. In: INTERSPEECH-2015, 2015, pp. 1805-1809. Conference paper (Refereed)
    Abstract [en]

    Repetitions in spoken dialogue systems can be a symptom of problematic communication. Such repetitions are often due to speech recognition errors, which in turn make it harder to use the speech recognizer's output to detect them. In this paper, we combine the alignment score obtained using phonetic distances with dialogue-related features to improve repetition detection. To evaluate the proposed method, we compare several alignment techniques, from edit distance to the DTW-based distance previously used in spoken-term detection tasks. We also compare two methods of computing the phonetic distance: the first uses the phoneme sequence, and the second uses the distance between phone posterior vectors. Two datasets were used in this evaluation: a bus-schedule information system (in English) and a call-routing system (in Swedish). The results show that approaches using phoneme distances outperform approaches using Levenshtein distances between ASR outputs for repetition detection.
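
As a rough illustration of the phoneme-sequence variant (the paper's posterior-vector and DTW variants are not shown), a normalised edit distance between the phoneme strings of two consecutive user turns could serve as a repetition score. The phoneme sequences and the normalisation below are our own assumptions, not the paper's exact formulation.

```python
# Sketch: repetition scoring via normalised Levenshtein (edit) distance
# over phoneme sequences, one of the alignment techniques the paper compares.

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            curr[j] = min(prev[j] + 1,                          # deletion
                          curr[j - 1] + 1,                      # insertion
                          prev[j - 1] + (a[i - 1] != b[j - 1])) # substitution
        prev = curr
    return prev[n]

def repetition_score(phones_prev, phones_curr):
    """1.0 = identical phoneme sequences, 0.0 = maximally different."""
    dist = levenshtein(phones_prev, phones_curr)
    return 1.0 - dist / max(len(phones_prev), len(phones_curr), 1)

turn1 = ["HH", "AH", "L", "OW"]        # "hello"
turn2 = ["HH", "AH", "L", "OW", "W"]   # a near-repetition
print(repetition_score(turn1, turn2))  # 0.8
```

In the paper, such an alignment score is combined with dialogue-related features rather than thresholded on its own.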

  • 40.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Boye, Johan
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Crowdsourcing Street-level Geographic Information Using a Spoken Dialogue System, 2014. In: Proceedings of the SIGDIAL 2014 Conference, Association for Computational Linguistics, 2014, pp. 2-11. Conference paper (Refereed)
    Abstract [en]

    We present a technique for crowdsourcing street-level geographic information using spoken natural language. In particular, we are interested in obtaining first-person-view information about what can be seen from different positions in the city. This information can then be used, for example, in pedestrian routing services. The approach has been tested in the lab using a fully implemented spoken dialogue system and shows promising results.

  • 41.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Boye, Johan
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Using a Spoken Dialogue System for Crowdsourcing Street-level Geographic Information, 2014. Conference paper (Refereed)
    Abstract [en]

    We present a novel scheme for enriching geographic databases with street-level geographic information that could be useful for pedestrian navigation. A spoken dialogue system for crowdsourcing street-level geographic details was developed and tested in an in-lab experiment, showing promising results.

  • 42.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    David Lopes, José
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Automatic Detection of Miscommunication in Spoken Dialogue Systems, 2015. In: Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), 2015, pp. 354-363. Conference paper (Refereed)
    Abstract [en]

    In this paper, we present a data-driven approach for detecting instances of miscommunication in dialogue system interactions. A range of generic features, both automatically extractable and manually annotated, was used to train two models for online detection and one for offline analysis. Online detection could be used to raise the error awareness of the system, whereas offline detection could be used by a system designer to identify potential flaws in the dialogue design. In experimental evaluations on system logs from three different dialogue systems that vary in their dialogue strategy, the proposed models performed substantially better than the majority class baseline models.
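
The paper's actual feature set and learners are not reproduced here; as a hedged sketch, a generic classifier trained on automatically extractable turn features and compared against the majority-class baseline the abstract mentions might look as follows. The feature names and labels are invented for illustration.

```python
# Hedged sketch: detecting miscommunication turns with a generic classifier,
# evaluated against a majority-class baseline. Features and data are toy
# placeholders; the paper's actual features and annotations differ.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical per-turn features: [ASR confidence, turn length (words,
# scaled), number of re-prompts so far (scaled)]; label 1 = miscommunication.
X = rng.random((200, 3))
y = (X[:, 0] < 0.3).astype(int)  # toy rule standing in for real annotations

model = LogisticRegression()
baseline = DummyClassifier(strategy="most_frequent")
print("model   :", cross_val_score(model, X, y, cv=5).mean())
print("baseline:", cross_val_score(baseline, X, y, cv=5).mean())
```

The online/offline distinction in the paper amounts to restricting the feature set to what is available at runtime versus what can be annotated afterwards.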

  • 43.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    A Chunking Parser for Semantic Interpretation of Spoken Route Directions in Human-Robot Dialogue, 2012. In: Proceedings of the 4th Swedish Language Technology Conference (SLTC 2012), Lund, Sweden, 2012, pp. 55-56. Conference paper (Refereed)
    Abstract [en]

    We present a novel application of the chunking parser for data-driven semantic interpretation of spoken route directions into route graphs that are useful for robot navigation. Various feature sets and machine learning algorithms were explored. The results indicate that our approach is robust to speech recognition errors and, since it relies on simple features, could easily be applied to other languages.
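
To illustrate the target representation, the sketch below assembles a route graph from already-chunked directions. The chunk tags (ACTION, LANDMARK) and the graph encoding are illustrative assumptions; the paper derives the chunks with a trained, data-driven parser rather than rules.

```python
# Sketch: assembling route-graph segments from chunked route directions.
# Chunk inventory and output format are hypothetical simplifications.

def chunks_to_route_graph(chunks):
    """chunks: list of (tag, text); returns ordered route-graph segments."""
    graph, current = [], {}
    for tag, text in chunks:
        if tag == "ACTION":
            if current:
                graph.append(current)
            current = {"action": text, "landmarks": []}
        elif tag == "LANDMARK" and current:
            current["landmarks"].append(text)
    if current:
        graph.append(current)
    return graph

chunks = [("ACTION", "go straight"), ("LANDMARK", "the church"),
          ("ACTION", "turn left"), ("LANDMARK", "the square")]
print(chunks_to_route_graph(chunks))
# [{'action': 'go straight', 'landmarks': ['the church']},
#  {'action': 'turn left', 'landmarks': ['the square']}]
```

A robot navigation component can then consume the ordered segments one at a time, which is what makes the representation tolerant of word-level recognition errors elsewhere in the utterance.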

  • 44.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    A data-driven approach to understanding spoken route directions in human-robot dialogue, 2012. In: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, 2012, pp. 226-229. Conference paper (Refereed)
    Abstract [en]

    In this paper, we present a data-driven chunking parser for automatic interpretation of spoken route directions into a route graph that is useful for robot navigation. Different sets of features and machine learning algorithms are explored. The results indicate that our approach is robust to speech recognition errors.

  • 45.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    A Data-driven Model for Timing Feedback in a Map Task Dialogue System, 2013. In: 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue - SIGdial, Metz, France, 2013, pp. 375-383. Conference paper (Refereed)
    Abstract [en]

    We present a data-driven model for detecting suitable response locations in the user's speech. The model has been trained on human–machine dialogue data, and implemented and tested in a spoken dialogue system that can perform the Map Task with users. To our knowledge, this is the first example of a dialogue system that uses automatically extracted syntactic, prosodic and contextual features for online detection of response locations. A subjective evaluation of the dialogue system suggests that interactions with a system using our trained model were perceived as significantly better than those with a system using a model that made decisions at random.

  • 46.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Human Evaluation of Conceptual Route Graphs for Interpreting Spoken Route Descriptions, 2013. In: Proceedings of the 3rd International Workshop on Computational Models of Spatial Language Interpretation and Generation (CoSLI), Potsdam, Germany, 2013, pp. 30-35. Conference paper (Refereed)
    Abstract [en]

    We present a human evaluation of the usefulness of conceptual route graphs (CRGs) for route following based on spoken route descriptions. We describe a method for data-driven semantic interpretation of route descriptions into CRGs. Human participants performed comparably when sketching routes from CRGs based on manual transcriptions and from CRGs produced from speech-recognised route descriptions, indicating that our method preserves the conceptual information vital for route following despite speech recognition errors.

  • 47.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    The Map Task Dialogue System: A Test-bed for Modelling Human-Like Dialogue, 2013. In: 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue - SIGdial, Metz, France, 2013, pp. 366-368. Conference paper (Refereed)
    Abstract [en]

    The demonstrator presents a test-bed for collecting data on human–computer dialogue: a fully automated dialogue system that can perform the Map Task with a user. As a first step, we have used the test-bed to collect human–computer Map Task dialogue data and trained various data-driven models on it for detecting feedback response locations in the user's speech. One of the trained models has been tested in user interactions and was perceived as better than a system using a random model. The demonstrator will exhibit three versions of the Map Task dialogue system, each using a different trained data-driven model for Response Location Detection.

  • 48.
    Meena, Raveesh
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication.
    Data-driven models for timing feedback responses in a Map Task dialogue system, 2014. In: Computer Speech & Language (Print), ISSN 0885-2308, E-ISSN 1095-8363, Vol. 28, no. 4, pp. 903-922. Journal article (Refereed)
    Abstract [en]

    Traditional dialogue systems use a fixed silence threshold to detect the end of users' turns. Such a simplistic model can result in system behaviour that is both interruptive and unresponsive, which in turn affects the user experience. Various studies have observed that human interlocutors take cues from speaker behaviour, such as prosody, syntax and gestures, to coordinate smooth exchanges of speaking turns. However, little effort has been made to implement these models in dialogue systems and to verify how well they model turn-taking behaviour in human–computer interactions. We present a data-driven approach to building models for online detection of suitable feedback response locations in the user's speech. We first collected human–computer interaction data using a spoken dialogue system that can perform the Map Task with users (albeit using a trick). On these data, we trained various models that use automatically extractable prosodic, contextual and lexico-syntactic features for detecting response locations. Next, we implemented a trained model in the same dialogue system and evaluated it in interactions with users. The subjective and objective measures from the user evaluation confirm that a model trained on speaker behavioural cues offers both smoother turn-transitions and more responsive system behaviour.
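
The contrast between a fixed silence threshold and a cue-based decision can be sketched as follows. The threshold, feature weights and scoring function are placeholders of our own devising, not the article's actual trained model.

```python
# Sketch of the two turn-taking policies contrasted in the article:
# (a) a fixed silence threshold, (b) a decision that also consults
# prosodic and lexico-syntactic cues. All numbers are placeholders.

FIXED_SILENCE_THRESHOLD = 0.7  # seconds; typical of traditional systems

def fixed_threshold_policy(silence_sec):
    """Respond only after a fixed amount of silence, regardless of cues."""
    return silence_sec >= FIXED_SILENCE_THRESHOLD

def feature_based_policy(silence_sec, pitch_falling, utterance_complete):
    """Respond earlier when cues signal turn-yielding; wait when they don't."""
    score = (0.4 * pitch_falling          # final falling pitch (0..1)
             + 0.4 * utterance_complete   # syntactic completeness (0..1)
             + 0.2 * min(silence_sec / FIXED_SILENCE_THRESHOLD, 1.0))
    return score >= 0.5

# A short pause after a complete utterance with falling pitch:
print(fixed_threshold_policy(0.3))          # False: system keeps waiting
print(feature_based_policy(0.3, 1.0, 1.0))  # True: responds promptly
```

In the article, the weights are not hand-set like this but learned from the collected Map Task data.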

  • 49.
    Mirnig, Nicole
    et al.
    Weiss, Astrid
    Skantze, Gabriel
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Al Moubayed, Samer
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Granström, Björn
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Tscheligi, Manfred
    Face-To-Face With A Robot: What do we actually talk about?, 2013. In: International Journal of Humanoid Robotics, ISSN 0219-8436, Vol. 10, no. 1, p. 1350011. Journal article (Refereed)
    Abstract [en]

    While much of the state-of-the-art research in human-robot interaction (HRI) investigates task-oriented interaction, this paper explores what people talk about with a robot when the content of the conversation is not predefined. We used the robot head Furhat to explore the conversational behaviour of people who encountered a robot, without a predefined purpose, in the public setting of a robot exhibition in a science museum. Analysis of the conversations showed that a sophisticated robot provides an inviting atmosphere for people to engage in interaction, to be experimental and to challenge the robot's capabilities. Many visitors to the exhibition were willing to go beyond the guiding questions that were provided as a starting point. Amongst other things, they asked Furhat questions concerning the robot itself, such as how it would define a robot or whether it plans to take over the world. People were also interested in the feelings and likes of the robot, and they asked many personal questions; this is how Furhat ended up with its first marriage proposal. People who talked to Furhat were asked to complete a questionnaire assessing the conversation, which showed that the interaction with Furhat was rated as a pleasant experience.

  • 50.
    Peters, Christopher
    et al.
    KTH.
    Li, Chengjie
    KTH.
    Yang, Fangkai
    KTH.
    Avramova, Vanya
    KTH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Investigating Social Distances between Humans, Virtual Humans and Virtual Robots in Mixed Reality, 2018. In: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018, pp. 2247-2249. Conference paper (Refereed)
    Abstract [en]

    Mixed reality environments offer new potential for the design of compelling social interaction experiences with virtual characters. In this paper, we summarise initial experiments in which we measure comfortable social distances between humans, virtual humans and virtual robots in mixed reality environments. We consider a scenario in which participants walk within a comfortable distance of a virtual character whose appearance is varied between a male and a female human, and a standard-height and a human-height virtual Pepper robot. Our studies in mixed reality thus far indicate that humans adopt social zones with artificial agents similar to those in human-human social interactions and in interactions in virtual reality.
