Safe-To-Explore State Spaces: Ensuring Safe Exploration in Policy Search with Hierarchical Task Optimization
Aalto Univ, Dept Elect Engn & Automat, Intelligent Robot Grp, Helsinki, Finland.
KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, perception and learning, RPL. ORCID iD: 0000-0001-9603-1677
Orebro Univ, AASS Res Ctr, Orebro, Sweden.
Orebro Univ, AASS Res Ctr, Orebro, Sweden.
2018 (English). In: 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids) / [ed] Asfour, T, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 132-138. Conference paper, Published paper (Refereed)
Abstract [en]

Policy search reinforcement learning allows robots to acquire skills by themselves. However, the learning procedure is inherently unsafe as the robot has no a priori way to predict the consequences of the exploratory actions it takes. Therefore, exploration can lead to collisions with the potential to harm the robot and/or the environment. In this work we address the safety aspect by constraining the exploration to happen in safe-to-explore state spaces. These are formed by decomposing target skills (e.g., grasping) into higher ranked sub-tasks (e.g., collision avoidance, joint limit avoidance) and lower ranked movement tasks (e.g., reaching). Sub-tasks are defined as concurrent controllers (policies) in different operational spaces together with associated Jacobians representing their joint-space mapping. Safety is ensured by only learning policies corresponding to lower ranked sub-tasks in the redundant null space of higher ranked ones. As a side benefit, learning in sub-manifolds of the state space also facilitates sample efficiency. Reaching skills performed in simulation and grasping skills performed on a real robot validate the usefulness of the proposed approach.
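
The null-space idea in the abstract can be illustrated with a minimal sketch. It is not the paper's controller, which this record does not describe. It assumes velocity-level control, a Moore-Penrose pseudoinverse, and a hypothetical 3-joint arm with a one-dimensional high-priority task. The names `null_space_projector`, `combine`, `J_high`, `xdot_high`, and `qdot_explore` are all illustrative.

```python
import numpy as np


def null_space_projector(J):
    """Return N = I - pinv(J) @ J, the projector onto the null space of J."""
    n_joints = J.shape[1]
    return np.eye(n_joints) - np.linalg.pinv(J) @ J


def combine(J_high, xdot_high, qdot_low):
    """Joint velocity that serves the high-priority task first.

    The low-priority command (e.g. an exploratory action from a learned
    policy) is applied only in directions that leave the high-priority
    task velocity unchanged. The high-priority task is met exactly when
    J_high has full row rank.
    """
    J_pinv = np.linalg.pinv(J_high)
    N = null_space_projector(J_high)
    return J_pinv @ xdot_high + N @ qdot_low


# Hypothetical numbers, chosen only for illustration.
J_high = np.array([[1.0, 0.5, 0.0]])   # Jacobian of the high-ranked task
xdot_high = np.array([0.0])            # e.g. keep the distance to an obstacle
rng = np.random.default_rng(0)
qdot_explore = rng.normal(size=3)      # exploratory low-ranked command

qdot = combine(J_high, xdot_high, qdot_explore)
task_velocity = J_high @ qdot          # stays at xdot_high despite exploration
```

Whatever value `qdot_explore` takes, `task_velocity` matches `xdot_high` up to floating-point error, which is the sense in which exploration confined to the null space cannot disturb the higher-ranked task.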

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2018. p. 132-138
Series
IEEE-RAS International Conference on Humanoid Robots, ISSN 2164-0572
National Category
Robotics
Identifiers
URN: urn:nbn:se:kth:diva-245097
DOI: 10.1109/HUMANOIDS.2018.8624948
ISI: 000458689700019
Scopus ID: 2-s2.0-85062286430
ISBN: 978-1-5386-7283-9 (print)
OAI: oai:DiVA.org:kth-245097
DiVA, id: diva2:1294613
Conference
18th IEEE-RAS International Conference on Humanoid Robots (Humanoids), NOV 06-09, 2018, Beijing Inst Technol, Beijing, PEOPLES R CHINA
Note

QC 20190308

Available from: 2019-03-08. Created: 2019-03-08. Last updated: 2019-04-11. Bibliographically approved

By author/editor
Krug, Robert; Kyrki, Ville