BT-ACTION: A Test-Driven Approach for Modular Understanding of User Instruction Leveraging Behaviour Trees and LLMs
KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, Perception and Learning.
KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Digital futures. KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, Perception and Learning. ORCID iD: 0000-0002-2212-4325
KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, Perception and Learning. ORCID iD: 0000-0002-1733-7019
2025 (English). In: 2025 34th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 878-884. Conference paper, Published paper (Refereed)
Abstract [en]

Natural language instructions are often abstract and complex, requiring robots to execute multiple subtasks even for seemingly simple queries. For example, when a user asks a robot to prepare avocado toast, the task involves several sequential steps. Moreover, such instructions can be ambiguous or infeasible for the robot or may exceed the robot's existing knowledge. While Large Language Models (LLMs) offer strong language reasoning capabilities to handle these challenges, effectively integrating them into robotic systems remains a key challenge. To address this, we propose BT-ACTION, a test-driven approach that combines the modular structure of Behavior Trees (BT) with LLMs to generate coherent sequences of robot actions for following complex user instructions, specifically in the context of preparing recipes in a kitchen-assistance setting. We evaluated BT-ACTION in a comprehensive user study with 45 participants, comparing its performance to direct LLM prompting. Results demonstrate that the modular design of BT-ACTION helped the robot make fewer mistakes and increased user trust, and participants showed a significant preference for the robot leveraging the modular approach. The code is publicly available at https://github.com/1Eggbert7/BT_LLM.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025. p. 878-884
Series
IEEE RO-MAN, ISSN 1944-9445
National Category
Robotics and automation
Identifiers
URN: urn:nbn:se:kth:diva-378960
DOI: 10.1109/RO-MAN63969.2025.11217657
ISI: 001672967200120
Scopus ID: 2-s2.0-105024540475
ISBN: 979-8-3315-8772-7 (print)
ISBN: 979-8-3315-8771-0 (print)
OAI: oai:DiVA.org:kth-378960
DiVA, id: diva2:2050349
Conference
34th International Symposium on Robot and Human Interactive Communication (RO-MAN), August 25-29, 2025, Eindhoven, Netherlands
Note

QC 20260401

Available from: 2026-04-01. Created: 2026-04-01. Last updated: 2026-04-01. Bibliographically approved


Authority records

Leszczynski, Alexander; Gillet, Sarah; Leite, Iolanda; Dogan, Fethiye Irmak
