Exploring Temporal Dependencies in Multimodal Referring Expressions with Mixed Reality
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL; Intelligent Robotics Research Group, Aalto University, Espoo, Finland. ORCID iD: 0000-0001-6738-9872
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-0579-3372
2019 (English). In: Virtual, Augmented and Mixed Reality. Multimodal Interaction: 11th International Conference, VAMR 2019, Held as Part of the 21st HCI International Conference, HCII 2019, Orlando, FL, USA, July 26–31, 2019, Proceedings. Springer Verlag, 2019, p. 108–123. Conference paper, Published paper (Refereed).
Abstract [en]

In collaborative tasks, people rely on both verbal and non-verbal cues simultaneously to communicate with each other. For human-robot interaction to run smoothly and naturally, a robot should be equipped with the ability to robustly disambiguate referring expressions. In this work, we propose a model that can disambiguate multimodal fetching requests using modalities such as head movements, hand gestures, and speech. We analysed the data acquired from mixed reality experiments and formulated the hypothesis that modelling temporal dependencies of events across these three modalities increases the model’s predictive power. We evaluated our model within a Bayesian framework for interpreting referring expressions, both with and without exploiting the temporal prior.
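
The central idea in the abstract (a Bayesian interpretation of referring expressions that fuses per-object evidence from head movement, hand gesture, and speech, optionally weighted by a temporal prior) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the likelihood values, and the form of the temporal prior below are all illustrative assumptions.

# Minimal sketch (not the authors' code) of naive Bayesian fusion of three
# modality likelihoods over candidate objects, with an optional temporal prior.
# All names and numbers are illustrative assumptions.
import numpy as np

def fuse_modalities(head, gesture, speech, temporal_prior=None):
    """Return a posterior over candidate objects.

    Each argument is a 1-D sequence of unnormalised likelihoods, one entry per
    candidate object. When temporal_prior is None a uniform prior is used;
    otherwise the prior encodes which objects recent events made more likely.
    """
    head, gesture, speech = (np.asarray(x, dtype=float) for x in (head, gesture, speech))
    prior = np.ones_like(head) if temporal_prior is None else np.asarray(temporal_prior, dtype=float)
    # Treat the modalities as conditionally independent given the referent.
    unnormalised = prior * head * gesture * speech
    return unnormalised / unnormalised.sum()

# Three candidate objects: speech alone is ambiguous between objects 0 and 1;
# gaze and a temporal prior favouring object 1 resolve the reference.
posterior = fuse_modalities(
    head=[0.2, 0.6, 0.2],
    gesture=[0.3, 0.4, 0.3],
    speech=[0.4, 0.4, 0.2],
    temporal_prior=[0.2, 0.7, 0.1],
)
print(posterior.round(3))  # the highest mass falls on object 1

Calling the same function with temporal_prior=None falls back to a uniform prior, which loosely mirrors the paper's comparison of the model with and without the temporal prior.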

Place, publisher, year, edition, pages
Springer Verlag, 2019, p. 108–123
Series
Lecture Notes in Artificial Intelligence, ISSN 0302-9743 ; 11575
Keywords [en]
Human-robot interaction, Mixed reality, Multimodal interaction, Referring expressions, Human computer interaction, Human robot interaction, Bayesian frameworks, Collaborative tasks, Hand gesture, Head movements, Multi-modal, Multi-Modal Interactions, Predictive power
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:kth:diva-262467
DOI: 10.1007/978-3-030-21565-1_8
Scopus ID: 2-s2.0-85069730416
ISBN: 9783030215644 (print)
OAI: oai:DiVA.org:kth-262467
DiVA, id: diva2:1361975
Conference
11th International Conference on Virtual, Augmented and Mixed Reality, VAMR 2019, held as part of the 21st International Conference on Human-Computer Interaction, HCI International 2019; Orlando; United States; 26 July 2019 through 31 July 2019
Note

QC 20191017

Available from: 2019-10-17. Created: 2019-10-17. Last updated: 2020-01-15. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text | Scopus | Conference

Authority records

Sibirtseva, Elena; Ghadirzadeh, Ali; Leite, Iolanda; Björkman, Mårten; Kragic, Danica
