A dual-control dialogue framework for human-robot interaction data collection: integrating human emotional and contextual awareness with conversational AI
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-1001-6415
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-1399-6604
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-0397-6442
2024 (English). In: International Conference of Social Robotics (ICSR 2024), 2024. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

This paper presents a dialogue framework designed to capture human-robot interactions enriched with human-level situational awareness. The system integrates advanced large language models with real-time human-in-the-loop control. Central to this framework is an interaction manager that oversees information flow, turn-taking, and prosody control of a social robot’s responses. A key innovation is the control interface, which enables a human operator to perform tasks such as emotion recognition and action detection through a live video feed. The operator also manages high-level tasks, such as topic shifts or behaviour instructions.

Input from the operator is incorporated into the dialogue context managed by GPT-4o, thereby influencing the ongoing interaction. This allows for the collection of interactional data from an automated system that leverages human-level emotional and situational awareness. The audiovisual data will be used to explore the impact of situational awareness on user behaviour in task-oriented human-robot interaction.
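The mechanism the abstract describes — operator annotations being folded into the dialogue context that the LLM conditions on — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the class name, message format, and annotation labels are all hypothetical assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class InteractionManager:
    """Hypothetical sketch of an interaction manager: it maintains the
    dialogue context and injects operator annotations (e.g. recognised
    emotions, detected actions, topic shifts, behaviour instructions)
    as system-level messages so they influence the next response."""
    context: list = field(default_factory=lambda: [
        {"role": "system",
         "content": "You are a social robot in a task-oriented dialogue."}
    ])

    def add_user_turn(self, text: str) -> None:
        # A transcribed user utterance becomes an ordinary user message.
        self.context.append({"role": "user", "content": text})

    def add_operator_annotation(self, kind: str, value: str) -> None:
        # Operator input (e.g. an emotion judged from the live video feed)
        # is appended to the context before the next LLM call.
        self.context.append(
            {"role": "system", "content": f"[operator:{kind}] {value}"}
        )

manager = InteractionManager()
manager.add_user_turn("I can't get this piece to fit.")
manager.add_operator_annotation("emotion", "user appears frustrated")
# The accumulated context would then be sent to the model, e.g. via an
# OpenAI-style call: client.chat.completions.create(model="gpt-4o",
#                                                   messages=manager.context)
```

The design choice sketched here — annotations as extra system messages rather than edits to the user's utterance — keeps the operator's situational judgments separate from the verbatim speech, which matters when the logged context is later used as interaction data.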

Place, publisher, year, edition, pages
2024.
National Category
Natural Language Processing
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-375300
OAI: oai:DiVA.org:kth-375300
DiVA, id: diva2:2027039
Conference
International Conference of Social Robotics (ICSR 2024), Odense, Denmark, 24-26 October, 2024
Note

QC 20260112

Available from: 2026-01-12. Created: 2026-01-12. Last updated: 2026-01-12. Bibliographically approved.

Open Access in DiVA

fulltext (1782 kB), 23 downloads
File information
File name: FULLTEXT01.pdf
File size: 1782 kB
Checksum (SHA-512): 1cd2e6ba5e0fd30ae0ddd90deef0b4dbd3a7de9236081aa8964e45b2a600cabf2b14d937208dff9fb7fcdf013d4d7467149855f2a20b541b6990756e27e69405
Type: fulltext
Mimetype: application/pdf

Authority records

Marcinek, Lubos; Beskow, Jonas; Gustafsson, Joakim

Search in DiVA

By author/editor
Marcinek, Lubos; Beskow, Jonas; Gustafsson, Joakim
By organisation
Speech, Music and Hearing, TMH
Natural Language Processing

Search outside of DiVA

Google
Google Scholar
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are now no longer available.
