Enhanced Visual Scene Understanding through Human-Robot Dialog
KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS. ORCID iD: 0000-0002-4921-7193
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-8579-1790
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-0397-6442
2011 (English). In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, 2011, p. 3342-3348. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a novel human-robot-interaction framework for robust visual scene understanding. Without any a priori knowledge about the objects, the task of the robot is to correctly enumerate how many of them are in the scene and segment them from the background. Our approach builds on top of state-of-the-art computer vision methods, generating object hypotheses through segmentation. This process is combined with a natural dialog system, thus including a ‘human in the loop’ where, by exploiting the natural conversation of an advanced dialog system, the robot gains knowledge about ambiguous situations. We present an entropy-based system allowing the robot to detect the poorest object hypotheses and query the user for arbitration. Based on the information obtained from the human-robot dialog, the scene segmentation can be re-seeded and thereby improved. We present experimental results on real data that show an improved segmentation performance compared to segmentation without interaction.
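
The entropy-based selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how entropy is computed, so everything here is an assumption made for illustration: each object hypothesis is represented as a list of per-pixel label distributions, uncertainty is scored as the mean Shannon entropy over those pixels, and the names `shannon_entropy`, `poorest_hypothesis`, and the toy data are hypothetical. It is not the paper's implementation.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def poorest_hypothesis(hypotheses):
    """Return the name of the hypothesis with the highest mean per-pixel entropy.

    `hypotheses` maps a hypothesis name to a list of per-pixel label
    distributions (an assumed representation, chosen for illustration).
    """
    def mean_entropy(pixel_dists):
        return sum(shannon_entropy(d) for d in pixel_dists) / len(pixel_dists)

    return max(hypotheses, key=lambda name: mean_entropy(hypotheses[name]))


# Toy data: two hypotheses with (object, background) probabilities per pixel.
hypotheses = {
    "object_1": [[0.9, 0.1], [0.95, 0.05]],  # confident labels, low entropy
    "object_2": [[0.5, 0.5], [0.6, 0.4]],    # ambiguous labels, high entropy
}

# The most uncertain hypothesis is the one a robot would ask the user about.
query_target = poorest_hypothesis(hypotheses)
print(query_target)  # object_2
```

In this sketch the hypothesis with the most ambiguous labels is selected for a user query; how the answer is then used to re-seed the segmentation is outside its scope.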

Place, publisher, year, edition, pages
IEEE, 2011. p. 3342-3348
Series
Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on, ISSN 2153-0858
Keywords [en]
Service Robotics, Human-Robot Dialog, Machine Learning, Segmentation
National Category
Robotics and automation
Identifiers
URN: urn:nbn:se:kth:diva-46701
DOI: 10.1109/IROS.2011.6048219
ISI: 000297477503104
Scopus ID: 2-s2.0-84455206614
ISBN: 978-1-61284-454-1 (print)
OAI: oai:DiVA.org:kth-46701
DiVA, id: diva2:453961
Conference
International Conference on Intelligent Robots and Systems (IROS '11). San Francisco, CA, USA. 25 Sep - 30 Sep 2011
Projects
SavirGrasp
Funder
EU, FP7, Seventh Framework Programme, IST-FP7- IP-215821
ICT - The Next Generation
Note
QC 20111118
Available from: 2011-11-04 Created: 2011-11-04 Last updated: 2025-02-09 Bibliographically approved


Authority records

Bohg, Jeannette; Skantze, Gabriel; Gustafsson, Joakim; Kragic, Danica

Search in DiVA

By author/editor
Johnson-Roberson, Matthew; Bohg, Jeannette; Skantze, Gabriel; Gustafsson, Joakim; Carlson, Rolf; Kragic, Danica; Rasolzadeh, Babak
By organisation
Computer Vision and Active Perception, CVAP; Centre for Autonomous Systems, CAS; Speech, Music and Hearing, TMH
Robotics and automation
