VARIQuery: VAE Segment-based Active Learning for Query Selection in Preference-based Reinforcement Learning
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-3510-5481
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0001-7461-920X
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-4173-2593
2023 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Human-in-the-loop reinforcement learning (RL) methods actively integrate human knowledge to create reward functions for various robotic tasks. Learning from preferences shows promise, as it alleviates the need for demonstrations by querying humans on state-action sequences. However, the limited granularity of sequence-based approaches complicates temporal credit assignment. The amount of human querying is contingent on query quality, as redundant queries result in excessive human involvement. This paper addresses the often-overlooked aspect of query selection, which is closely related to active learning (AL). We propose a novel query selection approach that leverages variational autoencoder (VAE) representations of state sequences. In this manner, we formulate queries that are diverse in nature while simultaneously taking into account reward model estimations. We compare our approach to the current state-of-the-art query selection methods in preference-based RL, and find ours to be either on par or more sample-efficient through extensive benchmarking on simulated environments relevant to robotics. Lastly, we conduct an online study to verify the effectiveness of our query selection approach with real human feedback and examine several metrics related to human effort.

Place, publisher, year, edition, pages
2023.
National Category
Robotics and automation
Identifiers
URN: urn:nbn:se:kth:diva-333948
OAI: oai:DiVA.org:kth-333948
DiVA, id: diva2:1787953
Conference
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023), October 1–5, 2023, Detroit, Michigan, USA.
Note

QC 20230818

Available from: 2023-08-15. Created: 2023-08-15. Last updated: 2025-02-09. Bibliographically approved.

Authority records

Marta, Daniel; Holk, Simon; Pek, Christian; Tumova, Jana; Leite, Iolanda
