Publications (10 of 30)
Abelho Pereira, A. T., Oertel, C., Fermoselle, L., Mendelson, J. & Gustafson, J. (2019). Responsive Joint Attention in Human-Robot Interaction. In: Proceedings 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019: . Paper presented at 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019, Macau, SAR, China, November 3-8, 2019 (pp. 1080-1087). Institute of Electrical and Electronics Engineers (IEEE)
Responsive Joint Attention in Human-Robot Interaction
2019 (English). In: Proceedings 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019, Institute of Electrical and Electronics Engineers (IEEE), 2019, p. 1080-1087. Conference paper, Published paper (Refereed)
Abstract [en]

Joint attention has been shown to be crucial not only for human-human interaction but also for human-robot interaction. Joint attention can help to make cooperation more efficient, support disambiguation in instances of uncertainty and make interactions appear more natural and familiar. In this paper, we present an autonomous gaze system that uses multimodal perception capabilities to model responsive joint attention mechanisms. We investigate the effects of our system on people's perception of a robot within a problem-solving task. Results from a user study suggest that responsive joint attention mechanisms evoke higher perceived feelings of social presence on scales that regard the direction of the robot's perception.
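As a rough illustration of the responsive joint-attention behaviour described in the abstract, the sketch below shows a gaze policy that redirects the robot's gaze to whatever object the user is attending to, and back to the user's face otherwise. This is a minimal sketch under assumed names (PerceptionFrame, select_gaze_target), not the authors' actual system.

```python
# Hedged sketch (not the paper's implementation) of a responsive
# joint-attention gaze policy: when the perception layer reports that the
# user is attending to an object, the robot redirects its gaze to that
# object; otherwise it looks back at the user. All names are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PerceptionFrame:
    user_gaze_target: Optional[str]  # e.g. "puzzle_piece_3" or None
    user_is_speaking: bool

def select_gaze_target(frame: PerceptionFrame, current_target: str) -> str:
    """Return the label of the object (or 'user_face') the robot should gaze at."""
    if frame.user_gaze_target is not None:
        # Respond to the user's attention shift: establish joint attention.
        return frame.user_gaze_target
    if frame.user_is_speaking:
        # No object in focus: show attentiveness by gazing at the speaker.
        return "user_face"
    return current_target  # otherwise keep the current gaze target

# Example: the user looks at an object while speaking.
frame = PerceptionFrame(user_gaze_target="tower_block", user_is_speaking=True)
print(select_gaze_target(frame, current_target="user_face"))  # -> "tower_block"
```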

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2019
National Category
Other Engineering and Technologies
Identifiers
urn:nbn:se:kth:diva-267228 (URN); 10.1109/IROS40897.2019.8968130 (DOI); 000544658400117; 2-s2.0-85081154022 (Scopus ID)
Conference
2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019, Macau, SAR, China, November 3-8, 2019
Note

QC 20200217

Available from: 2020-02-04. Created: 2020-02-04. Last updated: 2025-02-18. Bibliographically approved.
Kontogiorgos, D., Avramova, V., Alexanderson, S., Jonell, P., Oertel, C., Beskow, J., . . . Gustafson, J. (2018). A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018): . Paper presented at International Conference on Language Resources and Evaluation (LREC 2018) (pp. 119-127). Paris
A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
2018 (English). In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, 2018, p. 119-127. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present a corpus of multiparty situated interaction where participants collaborated on moving virtual objects on a large touch screen. A moderator facilitated the discussion and directed the interaction. The corpus contains recordings of a variety of multimodal data: we captured speech, eye gaze and gesture data using a multisensory setup (wearable eye trackers, motion capture and audio/video). Furthermore, in the description of the multimodal corpus, we investigate four different types of social gaze: referential gaze, joint attention, mutual gaze and gaze aversion, from the perspectives of both speaker and listener. We annotated the groups' object references during object manipulation tasks and analysed the groups' proportional referential eye-gaze with regard to the referent object. When investigating the distributions of gaze during and before referring expressions, we could corroborate the differences in time between speakers' and listeners' eye gaze found in earlier studies. This corpus is of particular interest to researchers who are interested in social eye-gaze patterns in turn-taking and referring language in situated multi-party interaction.
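The proportional referential eye-gaze analysis mentioned above can be illustrated with a small sketch: the share of gaze samples that land on the referent object in a window before and during a referring expression. This is a hypothetical illustration with made-up field names, not the corpus' actual annotation or analysis code.

```python
# Illustrative sketch (not the corpus' annotation schema): proportion of
# gaze samples on the referent object shortly before and during a
# referring expression.
def referential_gaze_proportion(gaze_samples, referent, start, end, pre_window=1.0):
    """gaze_samples: list of (timestamp_s, target_label) tuples."""
    window = [t for t, target in gaze_samples
              if start - pre_window <= t <= end]
    on_referent = [t for t, target in gaze_samples
                   if start - pre_window <= t <= end and target == referent]
    return len(on_referent) / len(window) if window else 0.0

samples = [(0.0, "cup"), (0.5, "cup"), (1.0, "box"), (1.5, "box"), (2.0, "listener")]
# Referring expression "the box" uttered between 1.2 s and 1.8 s.
print(referential_gaze_proportion(samples, "box", start=1.2, end=1.8))  # ~0.67
```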

Place, publisher, year, edition, pages
Paris: , 2018
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-230238 (URN); 000725545000019; 2-s2.0-85059891166 (Scopus ID)
Conference
International Conference on Language Resources and Evaluation (LREC 2018)
Note

Part of proceedings ISBN 979-10-95546-00-9

QC 20180614

Available from: 2018-06-13. Created: 2018-06-13. Last updated: 2022-11-09. Bibliographically approved.
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J. & Gustafson, J. (2018). Crowdsourced Multimodal Corpora Collection Tool. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018): . Paper presented at The Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 728-734). Paris
Crowdsourced Multimodal Corpora Collection Tool
2018 (English). In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, 2018, p. 728-734. Conference paper, Published paper (Refereed)
Abstract [en]

In recent years, more and more multimodal corpora have been created. To our knowledge there is no publicly available tool which allows for acquiring controlled multimodal data of people in a rapid and scalable fashion. We therefore propose (1) a novel tool which will enable researchers to rapidly gather large amounts of multimodal data spanning a wide demographic range, and (2) an example of how we used this tool for the collection of our "Attentive listener" multimodal corpus. The code is released under an Apache License 2.0 and available as an open-source repository, which can be found at https://github.com/kth-social-robotics/multimodal-crowdsourcing-tool. This tool will allow researchers to set up their own multimodal data collection system quickly and create their own multimodal corpora. Finally, this paper discusses the advantages and disadvantages of a crowd-sourced data collection tool, especially in comparison to lab-recorded corpora.
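As a hedged illustration only (not code from the linked repository), a crowdsourced multimodal collection tool ultimately needs an endpoint where crowdworkers' browsers upload recorded clips together with task metadata. A minimal Flask-style sketch, with assumed field names, might look as follows.

```python
# Assumption-laden sketch of the kind of upload endpoint such a tool needs;
# Flask is used here purely for illustration.
from pathlib import Path
from flask import Flask, request

app = Flask(__name__)
DATA_DIR = Path("submissions")
DATA_DIR.mkdir(exist_ok=True)

@app.route("/submit", methods=["POST"])
def submit():
    worker_id = request.form["worker_id"]      # assigned by the crowdsourcing platform
    stimulus_id = request.form["stimulus_id"]  # which prompt the worker responded to
    clip = request.files["recording"]          # webcam/audio blob recorded in-browser
    clip.save(str(DATA_DIR / f"{worker_id}_{stimulus_id}.webm"))
    return {"status": "ok"}

if __name__ == "__main__":
    app.run(port=8080)
```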

Place, publisher, year, edition, pages
Paris: , 2018
National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-230236 (URN); 000725545000117; 2-s2.0-85059908776 (Scopus ID)
Conference
The Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Note

Part of proceedings ISBN 979-10-95546-00-9

QC 20180618

Available from: 2018-06-13. Created: 2018-06-13. Last updated: 2022-11-09. Bibliographically approved.
Jonell, P., Bystedt, M., Fallgren, P., Kontogiorgos, D., David Aguas Lopes, J., Malisz, Z., . . . Shore, T. (2018). FARMI: A Framework for Recording Multi-Modal Interactions. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018): . Paper presented at The Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7-12 May 2018 (pp. 3969-3974). Paris: European Language Resources Association
FARMI: A Framework for Recording Multi-Modal Interactions
2018 (English). In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris: European Language Resources Association, 2018, p. 3969-3974. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present (1) a processing architecture used to collect multi-modal sensor data, both for corpora collection and real-time processing, (2) an open-source implementation thereof and (3) a use-case where we deploy the architecture in a multi-party deception game, featuring six human players and one robot. The architecture is agnostic to the choice of hardware (e.g. microphones, cameras, etc.) and programming languages, although our implementation is mostly written in Python. In our use-case, different methods of capturing verbal and non-verbal cues from the participants were used. These were processed in real-time and used to inform the robot about the participants' deceptive behaviour. The framework is of particular interest for researchers who are interested in the collection of multi-party, richly recorded corpora and the design of conversational systems. Moreover, for researchers interested in human-robot interaction, the available modules offer the possibility to easily create both autonomous and Wizard-of-Oz interactions.
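To illustrate the idea of a hardware- and language-agnostic recording architecture (an assumption-laden sketch, not FARMI's actual API), sensor processes can publish timestamped JSON messages over a socket to a central logger that both archives them for corpus collection and hands them to real-time consumers.

```python
# Illustrative sketch only: any capture process (in any language) sends
# timestamped JSON over UDP; a central logger archives every message and
# could forward it to real-time consumers. Names and ports are assumptions.
import json
import socket
import time

def run_logger(host="0.0.0.0", port=9999, logfile="session.jsonl"):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    with open(logfile, "a") as log:
        while True:
            payload, _addr = sock.recvfrom(65535)
            msg = json.loads(payload)
            msg["received_at"] = time.time()   # logger-side timestamp
            log.write(json.dumps(msg) + "\n")  # archive for the corpus
            # ...real-time consumers (e.g. a dialogue manager) would hook in here

def send_sample(sensor, data, host="127.0.0.1", port=9999):
    """Any capture process (microphone, camera, eye tracker) can call this."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    msg = {"sensor": sensor, "timestamp": time.time(), "data": data}
    sock.sendto(json.dumps(msg).encode(), (host, port))

# Example: a (hypothetical) gaze tracker publishing one sample.
send_sample("eye_tracker", {"x": 0.42, "y": 0.17})
```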

Place, publisher, year, edition, pages
Paris: European Language Resources Association, 2018
National Category
Natural Sciences; Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-230237 (URN); 000725545004009; 2-s2.0-85058179983 (Scopus ID)
Conference
The Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7-12 May 2018
Note

Part of proceedings ISBN 979-10-95546-00-9

QC 20180618

Available from: 2018-06-13. Created: 2018-06-13. Last updated: 2022-09-22. Bibliographically approved.
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J. & Gustafson, J. (2017). Crowd-powered design of virtual attentive listeners. In: 17th International Conference on Intelligent Virtual Agents, IVA 2017: . Paper presented at 17th International Conference on Intelligent Virtual Agents, IVA 2017, Stockholm, Sweden, 27 August 2017 through 30 August 2017 (pp. 188-191). Springer, 10498
Crowd-powered design of virtual attentive listeners
2017 (English). In: 17th International Conference on Intelligent Virtual Agents, IVA 2017, Springer, 2017, Vol. 10498, p. 188-191. Conference paper, Published paper (Refereed)
Abstract [en]

This demo presents a web-based system that generates attentive listening behaviours in a virtual agent, acquired from audio-visual recordings of crowdworkers' attitudinal feedback behaviour.

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN 0302-9743 ; 10498
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-216357 (URN); 10.1007/978-3-319-67401-8_21 (DOI); 000455400000021; 2-s2.0-85028975732 (Scopus ID); 9783319674001 (ISBN)
Conference
17th International Conference on Intelligent Virtual Agents, IVA 2017, Stockholm, Sweden, 27 August 2017 through 30 August 2017
Note

QC 20171023

Available from: 2017-10-23. Created: 2017-10-23. Last updated: 2024-03-15. Bibliographically approved.
Oertel, C., Jonell, P., Kontogiorgos, D., Mendelson, J., Beskow, J. & Gustafson, J. (2017). Crowdsourced design of artificial attentive listeners. Paper presented at INTERSPEECH: Situated Interaction, August 20-24, 2017.
Crowdsourced design of artificial attentive listeners
2017 (English). Conference paper, Published paper (Refereed)
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-215505 (URN)
Conference
INTERSPEECH: Situated Interaction, August 20-24, 2017
Note

QC 20171011

Available from: 2017-10-10. Created: 2017-10-10. Last updated: 2024-03-15. Bibliographically approved.
Oertel, C., Jonell, P., Kontogiorgos, D., Mendelson, J., Beskow, J. & Gustafson, J. (2017). Crowd-Sourced Design of Artificial Attentive Listeners. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH: . Paper presented at 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017; Stockholm; Sweden; 20 August 2017 through 24 August 2017 (pp. 854-858). International Speech Communication Association, 2017
Crowd-Sourced Design of Artificial Attentive Listeners
2017 (English). In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, International Speech Communication Association, 2017, Vol. 2017, p. 854-858. Conference paper, Published paper (Refereed)
Abstract [en]

Feedback generation is an important component of human-human communication. Humans can choose to signal support, understanding, agreement or scepticism by means of feedback tokens. Many studies have focused on the timing of feedback behaviours. In the current study, however, we keep the timing constant and instead focus on the lexical form and prosody of feedback tokens as well as their sequential patterns. For this we crowdsourced participants' feedback behaviour in identical interactional contexts in order to model a virtual agent that is able to provide feedback as an attentive/supportive as well as an attentive/sceptical listener. The resulting models were realised in a robot which was evaluated by third-party observers.
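A toy sketch of the modelling idea, with invented tokens and probabilities rather than the authors' learned model: a listener stance selects among feedback tokens with stance-specific probabilities while the feedback timing stays fixed.

```python
# Hedged illustration (not the paper's model): stance-conditioned sampling
# of feedback tokens at fixed feedback slots. Tokens and weights are invented.
import random

TOKEN_DISTRIBUTIONS = {
    "supportive": {"mhm": 0.35, "yeah": 0.30, "okay": 0.20, "really?": 0.15},
    "sceptical":  {"hm": 0.40, "really?": 0.30, "okay...": 0.20, "mhm": 0.10},
}

def sample_feedback(stance: str, rng: random.Random) -> str:
    tokens, weights = zip(*TOKEN_DISTRIBUTIONS[stance].items())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
# Generate a feedback token at each (fixed) feedback slot for both stances.
print([sample_feedback("supportive", rng) for _ in range(5)])
print([sample_feedback("sceptical", rng) for _ in range(5)])
```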

Place, publisher, year, edition, pages
International Speech Communication Association, 2017
Series
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, ISSN 2308-457X ; 2017
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-268357 (URN); 10.21437/Interspeech.2017-926 (DOI); 000457505000181; 2-s2.0-85028998444 (Scopus ID); 978-1-5108-4876-4 (ISBN)
Conference
18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017; Stockholm; Sweden; 20 August 2017 through 24 August 2017
Note

QC 20200703

Available from: 2020-02-18. Created: 2020-02-18. Last updated: 2025-02-07. Bibliographically approved.
Oertel, C., Jonell, P., Haddad, K. E., Szekely, E. & Gustafson, J. (2017). Using crowd-sourcing for the design of listening agents: Challenges and opportunities. In: ISIAA 2017 - Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, Co-located with ICMI 2017: . Paper presented at 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, ISIAA 2017, Glasgow, United Kingdom, 13 November 2017 (pp. 37-38). Association for Computing Machinery (ACM)
Using crowd-sourcing for the design of listening agents: Challenges and opportunities
2017 (English). In: ISIAA 2017 - Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, Co-located with ICMI 2017, Association for Computing Machinery (ACM), 2017, p. 37-38. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we describe how audio-visual corpus recordings collected using crowd-sourcing techniques can be used for the audio-visual synthesis of attitudinal non-verbal feedback expressions for virtual agents. We discuss the limitations of this approach as well as where we see opportunities for this technology.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2017
Keywords
Artificial listener, Listening agent, Multimodal behaviour generation
National Category
Other Engineering and Technologies
Identifiers
urn:nbn:se:kth:diva-222507 (URN); 10.1145/3139491.3139499 (DOI); 2-s2.0-85041230172 (Scopus ID); 9781450355582 (ISBN)
Conference
1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, ISIAA 2017, Glasgow, United Kingdom, 13 November 2017
Note

QC 20180212

Available from: 2018-02-12. Created: 2018-02-12. Last updated: 2025-02-18. Bibliographically approved.
Oertel, C., Gustafson, J. & Black, A. (2016). On Data Driven Parametric Backchannel Synthesis for Expressing Attentiveness in Conversational Agents. In: Proceedings of Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI), satellite workshop of ICMI 2016. Paper presented at Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI), satellite workshop of ICMI 2016.
On Data Driven Parametric Backchannel Synthesis for Expressing Attentiveness in Conversational Agents
2016 (English). In: Proceedings of Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI), satellite workshop of ICMI 2016, 2016. Conference paper, Published paper (Refereed)
Abstract [en]

In this study, we used a multi-party recording as a template for building a parametric speech synthesiser which is able to express different levels of attentiveness in backchannel tokens. This allowed us to investigate i) whether it is possible to express the same perceived level of attentiveness in synthesised as in natural backchannels; ii) whether it is possible to increase and decrease the perceived level of attentiveness of backchannels beyond the range observed in the original corpus.
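The parametric control described here can be caricatured as interpolating prosodic parameters between values observed for low- and high-attentiveness backchannels, and extrapolating beyond that range. The numbers below are invented for illustration and are not taken from the paper.

```python
# Toy illustration (not the paper's synthesiser): interpolate/extrapolate
# prosodic parameters between "low" and "high" attentiveness settings, so
# levels outside the observed range can also be requested. Values are made up.
LOW = {"f0_mean_hz": 110.0, "f0_range_hz": 20.0, "duration_s": 0.25, "intensity_db": 58.0}
HIGH = {"f0_mean_hz": 160.0, "f0_range_hz": 65.0, "duration_s": 0.45, "intensity_db": 66.0}

def backchannel_params(attentiveness: float) -> dict:
    """attentiveness 0.0..1.0 spans the observed range; >1.0 extrapolates beyond it."""
    return {k: LOW[k] + attentiveness * (HIGH[k] - LOW[k]) for k in LOW}

print(backchannel_params(0.5))   # within the range observed in the corpus
print(backchannel_params(1.3))   # pushed beyond the observed range
```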

Keywords
Attentive agents, Backchannels, Synthesis
National Category
Computer Sciences; Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-198173 (URN); 10.1145/3011263.3011272 (DOI); 2-s2.0-85003674254 (Scopus ID)
Conference
Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI), satellite workshop of ICMI 2016
Note

QC 20161214

Available from: 2016-12-13. Created: 2016-12-13. Last updated: 2025-02-01. Bibliographically approved.
Oertel, C., David Lopes, J., Yu, Y., Funes, K., Gustafson, J., Black, A. & Odobez, J.-M. (2016). Towards Building an Attentive Artificial Listener: On the Perception of Attentiveness in Audio-Visual Feedback Tokens. In: Proceedings of the 18th ACM International Conference on Multimodal Interaction (ICMI 2016): . Paper presented at the 18th ACM International Conference on Multimodal Interaction (ICMI 2016). Tokyo, Japan
Towards Building an Attentive Artificial Listener: On the Perception of Attentiveness in Audio-Visual Feedback Tokens
2016 (English). In: Proceedings of the 18th ACM International Conference on Multimodal Interaction (ICMI 2016), Tokyo, Japan, 2016. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Tokyo, Japan: , 2016
National Category
Computer Sciences; Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-198171 (URN); 10.1145/2993148.2993188 (DOI); 000390299900007; 2-s2.0-85016607242 (Scopus ID)
Conference
the 18th ACM International Conference on Multimodal Interaction (ICMI 2016)
Note

QC 20161214

Available from: 2016-12-13. Created: 2016-12-13. Last updated: 2025-02-01. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-8273-0132