Publications (10 of 41)
Sibirtseva, E., Ghadirzadeh, A., Leite, I., Björkman, M. & Kragic, D. (2019). Exploring Temporal Dependencies in Multimodal Referring Expressions with Mixed Reality. In: Virtual, Augmented and Mixed Reality. Multimodal Interaction 11th International Conference, VAMR 2019, Held as Part of the 21st HCI International Conference, HCII 2019, Orlando, FL, USA, July 26–31, 2019, Proceedings. Paper presented at 11th International Conference on Virtual, Augmented and Mixed Reality, VAMR 2019, held as part of the 21st International Conference on Human-Computer Interaction, HCI International 2019; Orlando; United States; 26 July 2019 through 31 July 2019 (pp. 108-123). Springer Verlag
Exploring Temporal Dependencies in Multimodal Referring Expressions with Mixed Reality
2019 (English) In: Virtual, Augmented and Mixed Reality. Multimodal Interaction 11th International Conference, VAMR 2019, Held as Part of the 21st HCI International Conference, HCII 2019, Orlando, FL, USA, July 26–31, 2019, Proceedings, Springer Verlag, 2019, p. 108-123. Conference paper, Published paper (Refereed)
Abstract [en]

In collaborative tasks, people rely on both verbal and non-verbal cues simultaneously to communicate with each other. For human-robot interaction to run smoothly and naturally, a robot should be equipped with the ability to robustly disambiguate referring expressions. In this work, we propose a model that can disambiguate multimodal fetching requests using modalities such as head movements, hand gestures, and speech. We analysed the data acquired from mixed reality experiments and formulated the hypothesis that modelling temporal dependencies of events in these three modalities increases the model's predictive power. We evaluated our model within a Bayesian framework for interpreting referring expressions, both with and without exploiting the temporal prior.
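
As a rough, hypothetical illustration of the underlying idea rather than the authors' implementation, the following Python sketch fuses per-modality evidence over candidate referents in a naive-Bayes fashion and optionally reweights it with a temporal prior; the object set, likelihood values and the prior are all made up.

import numpy as np

def fuse_modalities(likelihoods, temporal_prior=None):
    """Combine per-modality likelihoods over candidate referents.

    likelihoods: (n_modalities, n_objects) array, each row a likelihood over
    the same candidate objects (e.g. gaze, gesture, speech).
    temporal_prior: optional (n_objects,) weighting reflecting when each
    modality tends to be informative; uniform if omitted.
    """
    posterior = np.prod(likelihoods, axis=0)        # naive-Bayes fusion
    if temporal_prior is not None:
        posterior = posterior * temporal_prior      # temporal reweighting
    return posterior / posterior.sum()              # normalise to a distribution

# Hypothetical evidence for three candidate objects from gaze, gesture and speech.
likelihoods = np.array([[0.5, 0.3, 0.2],
                        [0.4, 0.4, 0.2],
                        [0.6, 0.2, 0.2]])
prior = np.array([0.5, 0.25, 0.25])
print(fuse_modalities(likelihoods, prior))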

Place, publisher, year, edition, pages
Springer Verlag, 2019
Series
Lecture Notes in Artificial Intelligence, ISSN 0302-9743 ; 11575
Keywords
Human-robot interaction, Mixed reality, Multimodal interaction, Referring expressions, Human computer interaction, Human robot interaction, Bayesian frameworks, Collaborative tasks, Hand gesture, Head movements, Multi-modal, Multi-Modal Interactions, Predictive power
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-262467 (URN), 10.1007/978-3-030-21565-1_8 (DOI), 2-s2.0-85069730416 (Scopus ID), 9783030215644 (ISBN)
Conference
11th International Conference on Virtual, Augmented and Mixed Reality, VAMR 2019, held as part of the 21st International Conference on Human-Computer Interaction, HCI International 2019; Orlando; United States; 26 July 2019 through 31 July 2019
Note

QC 20191017

Available from: 2019-10-17 Created: 2019-10-17 Last updated: 2019-10-17. Bibliographically approved
Ghadirzadeh, A., Bütepage, J., Maki, A., Kragic, D. & Björkman, M. (2016). A sensorimotor reinforcement learning framework for physical human-robot interaction. In: IEEE International Conference on Intelligent Robots and Systems. Paper presented at 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016, 9 October 2016 through 14 October 2016 (pp. 2682-2688). IEEE
A sensorimotor reinforcement learning framework for physical human-robot interaction
2016 (English) In: IEEE International Conference on Intelligent Robots and Systems, IEEE, 2016, p. 2682-2688. Conference paper, Published paper (Refereed)
Abstract [en]

Modeling of physical human-robot collaborations is generally a challenging problem due to the unpredictable nature of human behavior. To address this issue, we present a data-efficient reinforcement learning framework which enables a robot to learn how to collaborate with a human partner. The robot learns the task from its own sensorimotor experiences in an unsupervised manner. The uncertainty in the interaction is modeled using Gaussian processes (GP) to implement a forward model and an action-value function. Optimal action selection given the uncertain GP model is ensured by Bayesian optimization. We apply the framework to a scenario in which a human and a PR2 robot jointly control the ball position on a plank based on vision and force/torque data. Our experimental results show the suitability of the proposed method in terms of fast and data-efficient model learning, optimal action selection under uncertainty and equal role sharing between the partners.
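
The following Python sketch illustrates the general pattern only, a GP approximation of an action-value function with upper-confidence action selection; it is not the paper's implementation, and the state/action dimensions, kernel and exploration weight are assumptions.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Hypothetical logged interaction data: (state, action) inputs and observed returns.
X = rng.uniform(-1.0, 1.0, size=(50, 3))      # columns: [state_1, state_2, action]
y = -np.sum(X**2, axis=1) + 0.05 * rng.standard_normal(50)

# GP approximation of the action-value function Q(s, a).
gp = GaussianProcessRegressor(RBF(length_scale=0.5) + WhiteKernel(1e-3),
                              normalize_y=True).fit(X, y)

def select_action(state, candidates, kappa=2.0):
    """Pick the candidate action maximising an upper-confidence bound under the GP."""
    queries = np.column_stack([np.tile(state, (len(candidates), 1)), candidates])
    mean, std = gp.predict(queries, return_std=True)
    return candidates[np.argmax(mean + kappa * std)]

state = np.array([0.2, -0.1])
actions = np.linspace(-1.0, 1.0, 101)
print(select_action(state, actions))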

Place, publisher, year, edition, pages
IEEE, 2016
Keywords
Behavioral research, Intelligent robots, Reinforcement learning, Robots, Bayesian optimization, Forward modeling, Gaussian process, Human behaviors, Human-robot collaboration, Model learning, Optimal actions, Physical human-robot interactions, Human robot interaction
National Category
Robotics
Identifiers
urn:nbn:se:kth:diva-202121 (URN), 10.1109/IROS.2016.7759417 (DOI), 000391921702127 (), 2-s2.0-85006367922 (Scopus ID), 9781509037629 (ISBN)
Conference
2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016, 9 October 2016 through 14 October 2016
Note

QC 20170228

Available from: 2017-02-28 Created: 2017-02-28 Last updated: 2019-08-16. Bibliographically approved
Ghadirzadeh, A., Bütepage, J., Kragic, D. & Björkman, M. (2016). Self-learning and adaptation in a sensorimotor framework. In: Proceedings - IEEE International Conference on Robotics and Automation. Paper presented at 2016 IEEE International Conference on Robotics and Automation, ICRA 2016, 16 May 2016 through 21 May 2016 (pp. 551-558). IEEE conference proceedings
Self-learning and adaptation in a sensorimotor framework
2016 (English) In: Proceedings - IEEE International Conference on Robotics and Automation, IEEE conference proceedings, 2016, p. 551-558. Conference paper, Published paper (Refereed)
Abstract [en]

We present a general framework to autonomously achieve the task of finding a sequence of actions that results in a desired state. Autonomy is acquired by learning sensorimotor patterns of a robot while it is interacting with its environment. Gaussian processes (GP) with automatic relevance determination are used to learn the sensorimotor mapping. In this way, relevant sensory and motor components can be systematically found in high-dimensional sensory and motor spaces. We propose an incremental GP learning strategy which discerns between situations in which an update or an adaptation must be implemented. The Rapidly exploring Random Tree (RRT*) algorithm is exploited to enable long-term planning and to generate a sequence of states that lead to a given goal, while a gradient-based search finds the optimum action to steer to a neighbouring state in a single time step. Our experimental results demonstrate the suitability of the proposed framework for learning a joint-space controller with high data dimensions (10×15). The framework exhibits a short training phase (less than 12 seconds), real-time performance and rapid adaptation capabilities.
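
The Python sketch below illustrates only the automatic relevance determination component: a GP with one length scale per input dimension, where irrelevant inputs end up with large learned length scales. It is a generic stand-in, not the paper's incremental learner, and the toy mapping and dimensions are assumptions.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)

# Hypothetical sensorimotor data: 5 motor inputs, but only the first two
# actually influence the (scalar) sensory outcome.
X = rng.uniform(-1.0, 1.0, size=(80, 5))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1]

# One length scale per input dimension = automatic relevance determination.
kernel = RBF(length_scale=np.ones(5), length_scale_bounds=(1e-2, 1e3))
gp = GaussianProcessRegressor(kernel, normalize_y=True).fit(X, y)

# Irrelevant dimensions receive large optimised length scales after fitting.
print(gp.kernel_.length_scale)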

Place, publisher, year, edition, pages
IEEE conference proceedings, 2016
National Category
Robotics
Identifiers
urn:nbn:se:kth:diva-197241 (URN), 10.1109/ICRA.2016.7487178 (DOI), 000389516200069 (), 2-s2.0-84977498692 (Scopus ID), 9781467380263 (ISBN)
Conference
2016 IEEE International Conference on Robotics and Automation, ICRA 2016, 16 May 2016 through 21 May 2016
Note

QC 20161207

Available from: 2016-12-07 Created: 2016-11-30 Last updated: 2019-08-16. Bibliographically approved
Ghadirzadeh, A., Maki, A. & Björkman, M. (2015). A Sensorimotor Approach for Self-Learning of Hand-Eye Coordination. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, September 28 - October 02, 2015. Paper presented at Intelligent Robots and Systems (IROS), Hamburg, September 28 - October 02, 2015 (pp. 4969-4975). IEEE conference proceedings
A Sensorimotor Approach for Self-Learning of Hand-Eye Coordination
2015 (English) In: IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, September 28 - October 02, 2015, IEEE conference proceedings, 2015, p. 4969-4975. Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents a sensorimotor contingencies (SMC) based method to fully autonomously learn to perform hand-eye coordination. We divide the task into two visuomotor subtasks, visual fixation and reaching, and implement these on a PR2 robot assuming no prior information on its kinematic model. Our contributions are three-fold: i) grounding a robot in the environment by exploiting SMCs in the action planning system, which eliminates the need for prior knowledge of the kinematic or dynamic models of the robot; ii) using a forward model to search for proper actions to solve the task by minimizing a cost function, instead of training a separate inverse model, to speed up training; iii) encoding 3D spatial positions of a target object based on the robot's joint positions, thus avoiding calibration with respect to an external coordinate system. The method is capable of learning the task of hand-eye coordination from scratch with fewer than 20 sensory-motor pairs that are iteratively generated at real-time speed. In order to examine the robustness of the method against nonlinear image distortions, we apply a so-called retinal mapping image deformation to the input images. Experimental results show that the method succeeds even under considerable image deformations.
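
A generic Python sketch of contribution (ii), selecting an action by minimising a cost through a learned forward model instead of training an inverse model, is given below; the forward model, cost function and action grid are hypothetical placeholders.

import numpy as np

def forward_model(action):
    """Hypothetical learned forward model: predicts the sensory outcome
    (e.g. image-plane position of the hand) of a 2-D motor action."""
    return np.tanh(action) * 0.8

def choose_action(goal, candidates):
    """Pick the candidate action whose predicted outcome is closest to the goal."""
    costs = [np.linalg.norm(forward_model(a) - goal) for a in candidates]
    return candidates[int(np.argmin(costs))]

grid = np.linspace(-1.0, 1.0, 21)
candidates = np.array([[u, v] for u in grid for v in grid])  # coarse action grid
print(choose_action(np.array([0.3, -0.2]), candidates))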

Place, publisher, year, edition, pages
IEEE conference proceedings, 2015
Series
IEEE International Conference on Intelligent Robots and Systems, ISSN 2153-0858
Keywords
Reactive and Sensor-Based Planning, Robot Learning, Visual Servoing
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-179834 (URN), 10.1109/IROS.2015.7354076 (DOI), 000371885405012 (), 2-s2.0-84958153652 (Scopus ID), 9781479999941 (ISBN)
Conference
Intelligent Robots and Systems (IROS), Hamburg, September 28 - October 02, 2015
Projects
eSMCs
Note

QC 20160212

Available from: 2015-12-29 Created: 2015-12-29 Last updated: 2018-05-21. Bibliographically approved
Gratal, X., Smith, C., Björkman, M. & Kragic, D. (2015). Integrating 3D features and virtual visual servoing for hand-eye and humanoid robot pose estimation. In: IEEE-RAS International Conference on Humanoid Robots. Paper presented at 2013 13th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2013, 15 October 2013 through 17 October 2013 (pp. 240-245). IEEE Computer Society (February)
Integrating 3D features and virtual visual servoing for hand-eye and humanoid robot pose estimation
2015 (English) In: IEEE-RAS International Conference on Humanoid Robots, IEEE Computer Society, 2015, no. February, p. 240-245. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we propose an approach for vision-based pose estimation of a robot hand or a full robot body. The method is based on virtual visual servoing using a CAD model of the robot, and it combines 2-D image features with depth features. The method can be applied to estimate either the pose of a robot hand or the pose of the whole body, given that its joint configuration is known. We present experimental results that show the performance of the approach as demonstrated on both a mobile humanoid robot and a stationary manipulator.
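
The Python sketch below shows only the generic shape of a virtual visual servoing loop: a pose estimate is refined by minimising the residual between features rendered from a model at the current guess and the observed features. The render_features function is a hypothetical placeholder, not the authors' CAD-based rendering pipeline.

import numpy as np
from scipy.optimize import least_squares

def render_features(pose):
    """Hypothetical stand-in for rendering model features (2-D points plus
    depth) from a CAD model at a 6-D pose [x, y, z, rx, ry, rz]."""
    offsets = np.array([[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])
    return (pose[:3] + offsets + 0.05 * np.sin(pose[3:])).ravel()

# "Observed" features generated from a hypothetical true pose.
observed = render_features(np.array([0.4, -0.2, 0.9, 0.1, 0.0, -0.1]))

def residual(pose):
    # Rendered-minus-observed feature error, as in virtual visual servoing.
    return render_features(pose) - observed

result = least_squares(residual, np.zeros(6))
print(result.x)   # refined pose estimate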

Place, publisher, year, edition, pages
IEEE Computer Society, 2015
Keywords
Anthropomorphic robots, Computer aided design, Robotic arms, Robots, Visual servoing, CAD modeling, Depth features, Humanoid robot, Image features, Joint configuration, Mobile humanoid robot, Pose estimation, Virtual visual servoing, Manipulators
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-175058 (URN), 10.1109/HUMANOIDS.2013.7029982 (DOI), 2-s2.0-84937947912 (Scopus ID)
Conference
2013 13th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2013, 15 October 2013 through 17 October 2013
Note

QC 20151207

Available from: 2015-12-07 Created: 2015-10-09 Last updated: 2018-01-10. Bibliographically approved
Lundberg, I., Björkman, M. & Ögren, P. (2015). Intrinsic camera and hand-eye calibration for a robot vision system using a point marker. In: IEEE-RAS International Conference on Humanoid Robots. Paper presented at 2014 14th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2014, 18 November 2014 through 20 November 2014 (pp. 59-66). IEEE Computer Society
Intrinsic camera and hand-eye calibration for a robot vision system using a point marker
2015 (English) In: IEEE-RAS International Conference on Humanoid Robots, IEEE Computer Society, 2015, p. 59-66. Conference paper, Published paper (Refereed)
Abstract [en]

Accurate robot camera calibration is a requirement for vision-guided robots to perform precision assembly tasks. In this paper, we address the problem of performing intrinsic camera and hand-eye calibration on a robot vision system using a single point marker. This removes the need for bulky special-purpose calibration objects and also facilitates on-line accuracy checking and re-calibration when needed, without altering the robot's production environment. The proposed solution provides a calibration routine that produces high-quality results on par with the robot accuracy and completes a calibration in 3 minutes without the need for manual intervention. We also present a method for automatic testing of camera calibration accuracy. Results from experimental verification on the dual-arm concept robot FRIDA are presented.
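
For orientation only, a generic hand-eye calibration step can be expressed with OpenCV's standard solver (OpenCV 4.1 or later), as sketched below in Python; this is not the paper's single-point-marker routine, and the synthetic poses and ground-truth transforms are hypothetical.

import numpy as np
import cv2  # cv2.calibrateHandEye requires OpenCV >= 4.1

def rot(axis, angle):
    """3x3 rotation about a (unit) axis, via Rodrigues."""
    return cv2.Rodrigues(np.asarray(axis, dtype=float) * angle)[0]

def invert(R, t):
    return R.T, -R.T @ t

# Hypothetical ground truth: camera pose in the gripper frame (the unknown)
# and a fixed marker pose in the robot base frame.
R_x, t_x = rot([0, 0, 1], 0.3), np.array([[0.02], [0.01], [0.05]])
R_bt, t_bt = rot([1, 0, 0], 0.2), np.array([[0.6], [0.1], [0.0]])

R_g2b, t_g2b, R_t2c, t_t2c = [], [], [], []
for i in range(1, 11):
    # Synthetic gripper poses with varying rotation axes (needed for a well-posed problem).
    R_gb = rot([0, 0, 1], 0.2 * i) @ rot([0, 1, 0], 0.1 * i)
    t_gb = np.array([[0.1 * i], [0.05 * i], [0.5]])
    # Camera pose in base = gripper pose in base composed with camera-in-gripper.
    R_cb, t_cb = R_gb @ R_x, R_gb @ t_x + t_gb
    # Marker pose seen from the camera follows from the kinematic chain.
    R_bc, t_bc = invert(R_cb, t_cb)
    R_tc, t_tc = R_bc @ R_bt, R_bc @ t_bt + t_bc
    R_g2b.append(R_gb); t_g2b.append(t_gb)
    R_t2c.append(R_tc); t_t2c.append(t_tc)

# Standard AX = XB hand-eye calibration; should recover R_x, t_x from the synthetic data.
R_est, t_est = cv2.calibrateHandEye(R_g2b, t_g2b, R_t2c, t_t2c)
print(np.round(R_est - R_x, 6), t_est.ravel())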

Place, publisher, year, edition, pages
IEEE Computer Society, 2015
Keywords
Anthropomorphic robots, Automatic testing, Cameras, Computer vision, Robots, Camera calibration, Experimental verification, Hand-eye calibration, Manual intervention, Precision assemblies, Production environments, Robot vision systems, Vision-guided robots, Calibration
National Category
Computer Vision and Robotics (Autonomous Systems); Robotics
Identifiers
urn:nbn:se:kth:diva-181572 (URN), 10.1109/HUMANOIDS.2014.7041338 (DOI), 2-s2.0-84945196329 (Scopus ID), 9781479971749 (ISBN)
Conference
2014 14th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2014, 18 November 2014 through 20 November 2014
Note

QC 20160314

Available from: 2016-03-14 Created: 2016-02-02 Last updated: 2018-01-10. Bibliographically approved
Björkman, M., Bergström, N. & Kragic, D. (2014). Detecting, segmenting and tracking unknown objects using multi-label MRF inference. Computer Vision and Image Understanding, 118, 111-127
Detecting, segmenting and tracking unknown objects using multi-label MRF inference
2014 (English) In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 118, p. 111-127. Article in journal (Refereed) Published
Abstract [en]

This article presents a unified framework for detecting, segmenting and tracking unknown objects in everyday scenes, allowing for inspection of object hypotheses during interaction over time. A heterogeneous scene representation is proposed, with background regions modeled as combinations of planar surfaces and uniform clutter, and foreground objects as 3D ellipsoids. Recent energy minimization methods based on loopy belief propagation, tree-reweighted message passing and graph cuts are studied for the purpose of multi-object segmentation and benchmarked in terms of segmentation quality, as well as computational speed and how easily the methods can be adapted for parallel processing. One conclusion is that the choice of energy minimization method is less important than the way scenes are modeled. Proximities are more valuable for segmentation than similarity in colors, while the benefit of 3D information is limited. It is also shown through practical experiments that, with implementations on GPUs, multi-object segmentation and tracking using state-of-the-art MRF inference methods is feasible, despite the computational costs typically associated with such methods.
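
The Python sketch below writes out the generic multi-label MRF energy that such solvers minimise, unary data costs plus a Potts smoothness term, and runs a brute-force ICM pass as a simple stand-in for the stronger inference methods benchmarked in the article; the toy image, costs and smoothness weight are assumptions.

import numpy as np

rng = np.random.default_rng(2)
H, W, L = 20, 20, 3                       # toy image size and number of labels
unary = rng.random((H, W, L))             # hypothetical per-pixel data costs
LAMBDA = 0.5                              # Potts smoothness weight

def energy(labels):
    """Multi-label MRF energy: sum of unary costs plus Potts pairwise penalties."""
    u = unary[np.arange(H)[:, None], np.arange(W)[None, :], labels].sum()
    p = (labels[1:, :] != labels[:-1, :]).sum() + (labels[:, 1:] != labels[:, :-1]).sum()
    return u + LAMBDA * p

labels = unary.argmin(axis=2)             # initialise with unary-only decisions
for _ in range(5):                        # iterated conditional modes (toy-scale only)
    for i in range(H):
        for j in range(W):
            best, best_e = labels[i, j], energy(labels)
            for l in range(L):
                labels[i, j] = l
                e = energy(labels)
                if e < best_e:
                    best, best_e = l, e
            labels[i, j] = best
print(energy(labels))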

Place, publisher, year, edition, pages
Elsevier, 2014
Keywords
Figure-ground segmentation, Active perception, MRF, Multi-object tracking, Object detection, GPU acceleration
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-133215 (URN), 10.1016/j.cviu.2013.10.007 (DOI), 000328591500011 (), 2-s2.0-84890998700 (Scopus ID)
Note

QC 20140122. Updated from accepted to published.

Available from: 2013-10-28 Created: 2013-10-28 Last updated: 2018-01-11. Bibliographically approved
Pokorny, F. T., Bekiroglu, Y., Björkman, M., Exner, J. & Kragic, D. (2014). Grasp Moduli Spaces, Gaussian Processes and Multimodal Sensor Data. Paper presented at RSS 2014 Workshop: Information-based Grasp and Manipulation Planning, July 13, 2014, Berkeley, California.
Grasp Moduli Spaces, Gaussian Processes and Multimodal Sensor Data
2014 (English) Conference paper, Poster (with or without abstract) (Refereed)
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-165762 (URN)
Conference
RSS 2014 Workshop: Information-based Grasp and Manipulation Planning, July 13, 2014, Berkeley, California
Note

QC 20150506

Available from: 2015-04-29 Created: 2015-04-29 Last updated: 2018-01-11. Bibliographically approved
Björkman, M. & Bekiroglu, Y. (2014). Learning to Disambiguate Object Hypotheses through Self-Exploration. In: 14th IEEE-RAS International Conference on Humanoid Robots. Paper presented at 14th IEEE-RAS International Conference on Humanoid Robots (Humanoids), November 18-20, 2014, Madrid, Spain. IEEE Computer Society
Learning to Disambiguate Object Hypotheses through Self-Exploration
2014 (English) In: 14th IEEE-RAS International Conference on Humanoid Robots, IEEE Computer Society, 2014. Conference paper, Published paper (Refereed)
Abstract [en]

We present a probabilistic learning framework to form object hypotheses through interaction with the environment. A robot learns how to manipulate objects through pushing actions in order to identify how many objects are present in the scene. We use a segmentation system that initializes object hypotheses based on RGBD data and adopt a reinforcement learning approach to learn the relations between pushing actions and their effects on object segmentations. Trained models are used to generate actions that result in a minimum number of pushes on object groups, until either object separation events are observed or it is ensured that only one object is being acted on. We provide baseline experiments showing that a policy based on reinforcement learning for action selection results in fewer pushes than if pushing actions were selected randomly.
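
As a generic illustration of the action-selection idea rather than the paper's model, the Python sketch below runs tabular Q-learning in which states coarsely describe an object group and the reward favours pushes that quickly lead to a separation event; the states, actions and simulated outcomes are all hypothetical.

import numpy as np

rng = np.random.default_rng(3)
n_states, n_actions = 5, 4          # hypothetical: group-size bins x push directions
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.2, 0.9, 0.1

def simulate_push(state, action):
    """Hypothetical environment: some pushes separate objects (reward +1,
    terminal), others only cost a small penalty per extra push."""
    if rng.random() < 0.2 + 0.15 * action:      # certain push directions work better
        return 0, 1.0, True                     # separated: terminal state 0
    return min(state + 1, n_states - 1), -0.1, False

for episode in range(2000):
    state = rng.integers(1, n_states)
    done = False
    while not done:
        action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
        nxt, reward, done = simulate_push(state, action)
        target = reward + (0.0 if done else gamma * Q[nxt].max())
        Q[state, action] += alpha * (target - Q[state, action])   # Q-learning update
        state = nxt

print(Q.argmax(axis=1))   # learned push choice per state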

Place, publisher, year, edition, pages
IEEE Computer Society, 2014
Keywords
Anthropomorphic Robots, Reinforcement Learning, Action Selection, Object Groups, Object Segmentation, Object Separation, Policy-Based, Probabilistic Learning, Segmentation System
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-165630 (URN), 10.1109/HUMANOIDS.2014.7041418 (DOI), 2-s2.0-84945190036 (Scopus ID), 978-147997174-9 (ISBN)
Conference
14th IEEE-RAS International Conference on Humanoid Robots (Humanoids) November 18-20, 2014. Madrid, Spain
Note

QC 20150608

QC 20160203

Available from: 2015-04-29 Created: 2015-04-29 Last updated: 2018-01-11. Bibliographically approved
Ghadirzadeh, A., Kootstra, G., Maki, A. & Björkman, M. (2014). Learning visual forward models to compensate for self-induced image motion. In: 23rd IEEE International Conference on Robot and Human Interactive Communication: IEEE RO-MAN. Paper presented at 23rd IEEE International Conference on Robot and Human Interactive Communication: IEEE RO-MAN: August 25-29, 2014, Edinburgh, Scotland, UK (pp. 1110-1115). IEEE
Learning visual forward models to compensate for self-induced image motion
2014 (English) In: 23rd IEEE International Conference on Robot and Human Interactive Communication: IEEE RO-MAN, IEEE, 2014, p. 1110-1115. Conference paper, Published paper (Refereed)
Abstract [en]

Predicting the sensory consequences of an agent's own actions is considered an important skill for intelligent behavior. In terms of vision, so-called visual forward models can be applied to learn such predictions. This is no trivial task given the high dimensionality of sensory data and complex action spaces. In this work, we propose to learn the visual consequences of changes in pan and tilt of a robotic head using a visual forward model based on Gaussian processes and SURF correspondences. This is done without any assumptions about the kinematics of the system or any calibration requirements. The proposed method is compared to an earlier approach based on accumulator-based correspondences and radial basis function networks. We also show the feasibility of the proposed method for detecting independent motion using a moving camera system. By comparing the predicted and actually captured images, image motion due to the robot's own actions can be distinguished from motion caused by moving external objects. The results show the proposed method to be preferable to the earlier one in terms of both prediction errors and the ability to detect independent motion.
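
A minimal Python sketch of the underlying pattern, a GP forward model that maps head-motion commands to expected image displacement, with large residuals flagged as independent motion, is given below; the training data, scalar displacement model and threshold are assumptions, not the paper's SURF-based pipeline.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(4)

# Hypothetical training data: (delta_pan, delta_tilt) commands and the
# resulting horizontal image displacement of tracked features (pixels).
actions = rng.uniform(-0.2, 0.2, size=(60, 2))
displacement = 300.0 * actions[:, 0] + 2.0 * rng.standard_normal(60)

gp = GaussianProcessRegressor(RBF(0.1) + WhiteKernel(1.0),
                              normalize_y=True).fit(actions, displacement)

def independent_motion(action, observed_disp, threshold=15.0):
    """Flag image motion the forward model cannot explain by the robot's own action."""
    predicted = gp.predict(np.atleast_2d(action))[0]
    return abs(observed_disp - predicted) > threshold

print(independent_motion([0.1, 0.0], 30.5))   # roughly what the model expects -> False
print(independent_motion([0.1, 0.0], 95.0))   # residual too large -> external motion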

Place, publisher, year, edition, pages
IEEE, 2014
National Category
Robotics
Identifiers
urn:nbn:se:kth:diva-158120 (URN), 10.1109/ROMAN.2014.6926400 (DOI), 2-s2.0-84937605949 (Scopus ID), 978-1-4799-6763-6 (ISBN)
Conference
23rd IEEE International Conference on Robot and Human Interactive Communication : IEEE RO-MAN : August 25-29, 2014, Edinburgh, Scotland, UK
Note

QC 20150407

Available from: 2014-12-22 Created: 2014-12-22 Last updated: 2018-05-21. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-0579-3372