Kragic Jensfelt, Danica (ORCID iD: orcid.org/0000-0003-2965-2953)
Publications (10 of 460)
Khanna, P., Rajabi, N., Kanik, S. U. e., Kragic Jensfelt, D., Björkman, M. & Smith, C. (2026). Early detection of human handover intentions in human–robot collaboration: Comparing EEG, gaze, and hand motion. Robotics and Autonomous Systems, 196, Article ID 105244.
Early detection of human handover intentions in human–robot collaboration: Comparing EEG, gaze, and hand motion
2026 (English). In: Robotics and Autonomous Systems, ISSN 0921-8890, E-ISSN 1872-793X, Vol. 196, article id 105244. Article in journal (Refereed). Published
Abstract [en]

Human–robot collaboration (HRC) relies on accurate and timely recognition of human intentions to ensure seamless interactions. Among common HRC tasks, human-to-robot object handovers have been studied extensively for planning the robot's actions during object reception, assuming the human intention for object handover. However, distinguishing handover intentions from other actions has received limited attention. Most research on handovers has focused on visually detecting motion trajectories, which often results in delays or false detections when trajectories overlap. This paper investigates whether human intentions for object handovers are reflected in non-movement-based physiological signals. We conduct a multimodal analysis comparing three data modalities: electroencephalogram (EEG), gaze, and hand-motion signals. Our study aims to distinguish between handover-intended human motions and non-handover motions in an HRC setting, evaluating each modality's performance in predicting and classifying these actions before and after human movement initiation. We develop and evaluate human intention detectors based on these modalities, comparing their accuracy and timing in identifying handover intentions. To the best of our knowledge, this is the first study to systematically develop and test intention detectors across multiple modalities within the same experimental context of human–robot handovers. Our analysis reveals that handover intention can be detected from all three modalities. Nevertheless, gaze signals are the earliest as well as the most accurate to classify the motion as intended for handover or non-handover.
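As a concrete reading of the comparison described above, here is a minimal sketch of one intention classifier per modality; the feature arrays, dimensions, and logistic-regression model are placeholder assumptions, not the paper's detectors.

```python
# A minimal sketch, not the paper's pipeline: one classifier per modality,
# trained on placeholder windowed features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trials, window_dim = 200, 32

# Hypothetical per-trial feature windows for each modality.
modalities = {
    "eeg": rng.normal(size=(n_trials, window_dim)),
    "gaze": rng.normal(size=(n_trials, window_dim)),
    "hand": rng.normal(size=(n_trials, window_dim)),
}
y = rng.integers(0, 2, size=n_trials)  # 1 = handover intent, 0 = other motion

for name, X in modalities.items():
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"{name}: held-out accuracy = {clf.score(X_te, y_te):.2f}")
```

With real windowed EEG, gaze, and hand-motion features in place of the random arrays, the same loop gives the accuracy comparison; the timing comparison would additionally sweep the window's offset relative to movement onset.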

Place, publisher, year, edition, pages
Elsevier BV, 2026
Keywords
EEG, Gaze, Human–robot collaboration (HRC), Human–robot handovers, Motion analysis
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-373139 (URN); 10.1016/j.robot.2025.105244 (DOI); 2-s2.0-105021346666 (Scopus ID)
Note

QC 20251121

Available from: 2025-11-21. Created: 2025-11-21. Last updated: 2025-11-21. Bibliographically approved.
Billard, A., Albu-Schaeffer, A., Beetz, M., Burgard, W., Corke, P., Ciocarlie, M., . . . Scaramuzza, D. (2025). A roadmap for AI in robotics. Nature Machine Intelligence, 7(6), 818-824
A roadmap for AI in robotics
2025 (English). In: Nature Machine Intelligence, E-ISSN 2522-5839, Vol. 7, no. 6, p. 818-824. Article in journal (Refereed). Published
Abstract [en]

There is growing excitement about the potential of leveraging artificial intelligence (AI) to tackle some of the outstanding barriers to the full deployment of robots in daily lives. However, action and sensing in the physical world pose greater and different challenges for AI than analysing data in isolation, and it is important to reflect on which AI approaches are most likely to be successfully applied to robots. Questions to address, among others, are how AI models can be adapted to specific robot designs, tasks and environments. This Perspective offers an assessment of what AI has achieved for robotics since the 1990s and proposes a research roadmap with challenges and promises. These range from keeping up-to-date large datasets, representative of the diversity of tasks that robots may have to perform and of the environments they may encounter, to designing AI algorithms tailored specifically to robotics problems but generic enough to apply to a wide range of applications and transfer easily to a variety of robotic platforms. For robots to collaborate effectively with humans, they must predict human behaviour without relying on bias-based profiling. Explainability and transparency in AI-driven robot control are essential for building trust, preventing misuse and attributing responsibility in accidents. We close by describing what are, in our view, the primary long-term challenges: designing robots capable of lifelong learning, guaranteeing safe deployment and usage, and ensuring sustainable development.

Place, publisher, year, edition, pages
Springer Nature, 2025
National Category
Robotics and automation; Computer graphics and computer vision; Other Engineering and Technologies; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-368676 (URN); 10.1038/s42256-025-01050-6 (DOI); 001511728800001 (ISI); 2-s2.0-105008638330 (Scopus ID)
Note

QC 20250821

Available from: 2025-08-21. Created: 2025-08-21. Last updated: 2025-09-08. Bibliographically approved.
Ma, Y., Zhang, Y., Fu, D., Portales, S. Z., Kragic Jensfelt, D. & Fjeld, M. (2025). Advancing User-Voice Interaction: Exploring Emotion-Aware Voice Assistants Through a Role-Swapping Approach. In: Konomi, S. & Streitz, N. A. (Eds.), Distributed, Ambient and Pervasive Interactions, DAPI 2025, Pt I. Paper presented at 13th International Conference on Distributed, Ambient and Pervasive Interactions (DAPI), June 22-27, 2025, Gothenburg, Sweden (pp. 303-320). Springer Nature, 15802
Advancing User-Voice Interaction: Exploring Emotion-Aware Voice Assistants Through a Role-Swapping Approach
2025 (English). In: Distributed, Ambient and Pervasive Interactions, DAPI 2025, Pt I / [ed] Konomi, S. & Streitz, N. A., Springer Nature, 2025, Vol. 15802, p. 303-320. Conference paper, Published paper (Refereed)
Abstract [en]

As voice assistants (VAs) become increasingly integrated into daily life, the need for emotion-aware systems that can recognize and respond appropriately to user emotions has grown. While significant progress has been made in speech emotion recognition (SER) and sentiment analysis, effectively addressing user emotions, particularly negative ones, remains a challenge. This study explores human emotional response strategies in VA interactions using a role-swapping approach, where participants regulate AI emotions rather than receiving pre-programmed responses. Through speech feature analysis and natural language processing (NLP), we examined acoustic and linguistic patterns across various emotional scenarios. Results show that participants favor neutral or positive emotional responses when engaging with negative emotional cues, highlighting a natural tendency toward emotional regulation and de-escalation. Key acoustic indicators such as root mean square (RMS), zero-crossing rate (ZCR), and jitter were identified as sensitive to emotional states, while sentiment polarity and lexical diversity (TTR) distinguished between positive and negative responses. These findings provide valuable insights for developing adaptive, context-aware VAs capable of delivering empathetic, culturally sensitive, and user-aligned responses. By understanding how humans naturally regulate emotions in AI interactions, this research contributes to the design of more intuitive and emotionally intelligent voice assistants, enhancing user trust and engagement in human-AI interactions.
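For concreteness, a minimal sketch of two of the acoustic indicators named above, RMS and zero-crossing rate, computed per frame in plain NumPy; the frame and hop sizes are illustrative choices, not the study's settings.

```python
# A minimal sketch, assuming 1024-sample frames with 512-sample hops.
import numpy as np

def frame_features(signal, frame_len=1024, hop=512):
    """Per-frame root mean square (RMS) and zero-crossing rate (ZCR)."""
    rms, zcr = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        rms.append(np.sqrt(np.mean(frame ** 2)))
        # Fraction of adjacent sample pairs whose sign changes.
        zcr.append(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
    return np.array(rms), np.array(zcr)

# Usage on a synthetic 1-second, 16 kHz test tone.
t = np.linspace(0, 1, 16000, endpoint=False)
tone = 0.5 * np.sin(2 * np.pi * 220 * t)
rms, zcr = frame_features(tone)
print(f"mean RMS = {rms.mean():.3f}, mean ZCR = {zcr.mean():.3f}")
```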

Place, publisher, year, edition, pages
Springer Nature, 2025
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords
Emotion-Aware Voice Assistants, Role-Swapping Approach, Speech and Linguistic Analysis, Speech Emotion Recognition (SER)
National Category
Comparative Language Studies and Linguistics
Identifiers
urn:nbn:se:kth:diva-374167 (URN); 10.1007/978-3-031-92977-9_19 (DOI); 001551861000019 (ISI); 2-s2.0-105007671581 (Scopus ID)
Conference
13th International Conference on Distributed, Ambient and Pervasive Interactions (DAPI), June 22-27, 2025, Gothenburg, Sweden
Note

Part of ISBN 978-3-031-92976-2; 978-3-031-92977-9

QC 20251216

Available from: 2025-12-16. Created: 2025-12-16. Last updated: 2025-12-16. Bibliographically approved.
Ceylan, C., Ghoorchian, K. & Kragic Jensfelt, D. (2025). Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs. Transactions on Machine Learning Research, 2025-June
Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs
2025 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2025-June. Article in journal (Refereed). Published
Abstract [en]

Unsupervised learning of node embeddings for directed graphs (digraphs) requires careful handling to ensure unbiased modelling. This paper addresses two key challenges: (1) the obstruction of information propagation in random walk and message-passing methods due to local sinks, and (2) the representation of multiple multi-step directed neighbourhoods, arising from the distinction between in- and out-neighbours. These challenges are interconnected: local sinks can be mitigated by treating the graph as undirected, but this comes at the cost of discarding all directional information. We make two main contributions to unsupervised embedding learning for digraphs. First, we introduce ReachNEs (Reachability Node Embeddings), a general framework for analysing embedding models and diagnosing local sink behaviour on digraphs. ReachNEs defines the reachability filter, a matrix polynomial over normalized adjacency matrices that captures multi-step, direction-sensitive proximity. It unifies the analysis of message-passing and random walk models, making its insights applicable across a wide range of embedding methods. Second, we propose DirSwitch, a novel embedding model that resolves both local sink bias and neighbourhood multiplicity via switching random walks. These walks use directed edges for local steps, preserving directional structure, then switch to undirected edges for long-range transitions, enabling escape from local sinks and improving information dispersal. Empirical results on node classification benchmarks demonstrate that DirSwitch consistently outperforms state-of-the-art unsupervised digraph proximity embedding methods, and also serves as a flexible digraph extension for self-supervised graph neural networks.
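The abstract's two ingredients can be sketched compactly. A minimal sketch, assuming my own polynomial coefficients, out-degree normalization, and a fixed switching step; the exact ReachNEs and DirSwitch formulations are in the paper.

```python
# A minimal sketch under illustrative assumptions, not the paper's exact model.
import numpy as np

def reachability_filter(A, coeffs):
    """Matrix polynomial sum_k c_k * P^k over the row-normalized adjacency P."""
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1)  # sink rows stay zero
    R, P_k = np.zeros_like(P, dtype=float), np.eye(len(A))
    for c in coeffs:
        R += c * P_k
        P_k = P_k @ P
    return R

def switching_walk(A, start, n_steps, n_directed, rng):
    """Directed edges for the first n_directed steps, undirected afterwards."""
    A_und = np.maximum(A, A.T)          # drop edge directions
    node, walk = start, [start]
    for step in range(n_steps):
        M = A if step < n_directed else A_und
        nbrs = np.flatnonzero(M[node])
        if nbrs.size == 0:              # local sink with no escape
            break
        node = rng.choice(nbrs)
        walk.append(int(node))
    return walk

A = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)  # path graph, sink at node 2
print(reachability_filter(A, coeffs=[0.5, 0.3, 0.2]))
print(switching_walk(A, start=0, n_steps=5, n_directed=2, rng=np.random.default_rng(0)))
```

On this toy path graph, a purely directed walk stalls at the sink after two steps; switching to undirected edges lets it continue and keep dispersing information.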

Place, publisher, year, edition, pages
Transactions on Machine Learning Research, 2025
Keywords
directed graphs, node embeddings, unsupervised learning, random walks
National Category
Computer Sciences; Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-368840 (URN); 2-s2.0-105009417431 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20250902

Available from: 2025-09-02. Created: 2025-09-02. Last updated: 2025-09-02. Bibliographically approved.
Wang, R., Zhuang, Z., Jin, S., Ingelhag, N., Kragic Jensfelt, D. & Pokorny, F. T. (2025). Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies. In: 2025 IEEE International Conference on Robotics and Automation, ICRA 2025. Paper presented at 2025 IEEE International Conference on Robotics and Automation, ICRA 2025, Atlanta, United States of America, May 19-23, 2025 (pp. 3654-3661). Institute of Electrical and Electronics Engineers (IEEE)
Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies
2025 (English). In: 2025 IEEE International Conference on Robotics and Automation, ICRA 2025, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 3654-3661. Conference paper, Published paper (Refereed)
Abstract [en]

An end-to-end (E2E) visuomotor policy is typically treated as a unified whole, but recent approaches using out-of-domain (OOD) data to pretrain the visual encoder have cleanly separated the visual encoder from the network, with the remainder referred to as the policy. We propose Visual Alignment Testing, an experimental framework designed to evaluate the validity of this functional separation. Our results indicate that in E2E-trained models, visual encoders actively contribute to decision-making resulting from motor data supervision, contradicting the assumed functional separation. In contrast, OOD-pretrained models, where encoders lack this capability, experience an average performance drop of 42% in our benchmark results, compared to the state-of-the-art performance achieved by E2E policies. We believe this initial exploration of visual encoders' role can provide a first step towards guiding future pretraining methods to address their decision-making ability, such as developing task-conditioned or context-aware encoders.
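One way to make the encoder-as-decision-maker question concrete is a linear probe from frozen encoder features to actions: high probe accuracy means the encoder already carries decision-relevant structure. The sketch below is not the paper's Visual Alignment Testing protocol; observations, actions, and both "encoders" are random stand-ins.

```python
# A minimal sketch of the probe idea only; all data and encoders are
# hypothetical stand-ins, not the paper's models or benchmark.
import numpy as np

rng = np.random.default_rng(0)
obs = rng.normal(size=(500, 64))                 # placeholder observations
actions = obs @ rng.normal(size=(64, 7)) * 0.1   # placeholder motor targets

def probe_r2(features, targets):
    """R^2 of a least-squares linear probe from frozen features to actions."""
    W, *_ = np.linalg.lstsq(features, targets, rcond=None)
    resid = targets - features @ W
    return 1.0 - resid.var() / targets.var()

e2e_feats = obs @ rng.normal(size=(64, 16))           # stand-in "E2E-trained" encoder
ood_feats = np.tanh(obs) @ rng.normal(size=(64, 16))  # stand-in "OOD-pretrained" encoder
print("E2E-encoder probe R^2:", round(probe_r2(e2e_feats, actions), 3))
print("OOD-encoder probe R^2:", round(probe_r2(ood_feats, actions), 3))
```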

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-371368 (URN); 10.1109/ICRA55743.2025.11127332 (DOI); 001582497400330 (ISI); 2-s2.0-105016697318 (Scopus ID)
Conference
2025 IEEE International Conference on Robotics and Automation, ICRA 2025, Atlanta, United States of America, May 19-23, 2025
Note

Part of ISBN 9798331541392

QC 20251014

Available from: 2025-10-14. Created: 2025-10-14. Last updated: 2026-02-04. Bibliographically approved.
Marta, D., Holk, S., Vasco, M., Lundell, J., Homberger, T., Busch, F. L., . . . Leite, I. (2025). FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions. In: IEEE International Conference on Robotics and Automation. Paper presented at IEEE International Conference on Robotics and Automation, ICRA 2025, Atlanta, USA, 19-23 May 2025 (pp. 4789-4796). Institute of Electrical and Electronics Engineers (IEEE)
FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions
2025 (English). In: IEEE International Conference on Robotics and Automation, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 4789-4796. Conference paper, Published paper (Refereed)
Abstract [en]

Preference-based reinforcement learning (PbRL) is a suitable approach for style adaptation of pre-trained robotic behavior: adapting the robot's policy to follow human user preferences while still being able to perform the original task. However, collecting preferences for the adaptation process in robotics is often challenging and time-consuming. In this work we explore the adaptation of pre-trained robots in the low-preference-data regime. We show that, in this regime, recent adaptation approaches suffer from catastrophic reward forgetting (CRF), where the updated reward model overfits to the new preferences, leading the agent to become unable to perform the original task. To mitigate CRF, we propose to enhance the original reward model with a small number of parameters (low-rank matrices) responsible for modeling the preference adaptation. Our evaluation shows that our method can efficiently and effectively adjust robotic behavior to human preferences across simulation benchmark tasks and multiple real-world robotic tasks. We provide videos of our results and source code at https://sites.google.com/view/preflora/
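The core low-rank mechanism can be sketched in a few lines, assuming a single linear reward layer and rank-r factors B and A initialized so that their product is zero; FLoRA's actual reward architecture and training procedure are described in the paper.

```python
# A minimal sketch, assuming one linear reward layer; B starts at zero, so
# the adapted reward initially equals the frozen pre-trained reward.
import numpy as np

class LowRankReward:
    def __init__(self, dim, rank, rng):
        self.W = rng.normal(size=(dim, 1))           # frozen pre-trained weights
        self.B = np.zeros((dim, rank))               # trainable low-rank factor
        self.A = rng.normal(size=(rank, 1)) * 0.01   # trainable low-rank factor

    def reward(self, x):
        # Original reward plus the low-rank preference adaptation B @ A.
        return x @ (self.W + self.B @ self.A)

rng = np.random.default_rng(0)
model = LowRankReward(dim=32, rank=4, rng=rng)
x = rng.normal(size=(5, 32))
print(model.reward(x).ravel())  # equals the frozen reward while B is zero
```

Preference-based updates would then touch only B and A while W stays frozen, which is what shields the original task reward from being overwritten by a handful of new preference labels.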

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-360980 (URN); 10.1109/ICRA55743.2025.11127633 (DOI); 2-s2.0-105016684037 (Scopus ID)
Conference
IEEE International Conference on Robotics and Automation, ICRA 2025, Atlanta, USA, 19-23 May 2025
Note

QC 20250618

Part of ISBN 979-833154139-2

Available from: 2025-03-07. Created: 2025-03-07. Last updated: 2025-10-14. Bibliographically approved.
Ceylan, C., Ghoorchian, K. & Kragic Jensfelt, D. (2025). Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation. Transactions on Machine Learning Research, 2025-June
Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation
2025 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2025-June. Article in journal (Refereed). Published
Abstract [en]

Linear message-passing models have emerged as compelling alternatives to non-linear graph neural networks for unsupervised node embedding learning, due to their scalability and competitive performance on downstream tasks. However, we identify a fundamental flaw in recently proposed linear models that combine embedding aggregation with concatenation during each message-passing iteration: rank deficiency. A rank-deficient embedding matrix contains column vectors which take arbitrary values, leading to ill-conditioning that degrades downstream task accuracy, particularly in unsupervised tasks such as graph alignment. We deduce that repeated embedding aggregation and concatenation introduces linearly dependent features, causing rank deficiency. To address this, we propose ACC (Aggregate, Compress, Concatenate), a novel model that avoids redundant feature computation by applying aggregation to the messages from the previous iteration, rather than the embeddings. Consequently, ACC generates full-rank embeddings, significantly improving graph alignment accuracy from 10% to 60% compared to rank-deficient embeddings, while also being faster to compute. Additionally, ACC employs directed message-passing and achieves node classification accuracies comparable to state-of-the-art self-supervised graph neural networks on directed graph benchmarks, while also being over 70 times faster on graphs with over 1 million edges.
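The rank-deficiency claim is easy to reproduce on a toy digraph: concatenating an aggregated copy of the full embedding each iteration duplicates whole column blocks, whereas aggregating only the previous message does not. A minimal sketch under illustrative assumptions (the paper's model also includes a compression step, omitted here).

```python
# A toy reproduction of the rank argument; normalization and iteration count
# are illustrative choices, not the paper's exact model.
import numpy as np

def agg_concat(P, X, iters):
    """Aggregate and concatenate the full embedding each iteration:
    E_t = [E_{t-1}, P @ E_{t-1}]. The block P @ X appears twice by t = 2,
    so columns are linearly dependent and the matrix is rank-deficient."""
    E = X
    for _ in range(iters):
        E = np.hstack([E, P @ E])
    return E

def acc_style(P, X, iters):
    """Aggregate only the previous message, M_t = P @ M_{t-1}, and
    concatenate [X, M_1, ..., M_T]; no column block is computed twice."""
    M, blocks = X, [X]
    for _ in range(iters):
        M = P @ M
        blocks.append(M)
    return np.hstack(blocks)

rng = np.random.default_rng(0)
A = (rng.random((30, 30)) < 0.1).astype(float)          # random digraph
P = A / np.maximum(A.sum(axis=1, keepdims=True), 1.0)   # out-degree normalized
X = rng.normal(size=(30, 2))

E = agg_concat(P, X, iters=2)
Z = acc_style(P, X, iters=2)
print(E.shape, np.linalg.matrix_rank(E))  # (30, 8), rank at most 6
print(Z.shape, np.linalg.matrix_rank(Z))  # (30, 6), generically full rank 6
```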

Place, publisher, year, edition, pages
Transactions on Machine Learning Research, 2025
National Category
Computer Sciences; Communication Systems
Identifiers
urn:nbn:se:kth:diva-366571 (URN); 2-s2.0-105007994859 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20250710

Available from: 2025-07-10. Created: 2025-07-10. Last updated: 2025-07-10. Bibliographically approved.
Lu, H., Dong, Y., Weng, Z., Pokorny, F. T., Lundell, J. & Kragic Jensfelt, D. (2025). Grasping a Handful: Sequential Multi-Object Dexterous Grasp Generation. IEEE Robotics and Automation Letters, 10(11), 11880-11887
Grasping a Handful: Sequential Multi-Object Dexterous Grasp Generation
2025 (English). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 10, no. 11, p. 11880-11887. Article in journal (Refereed). Published
Abstract [en]

We introduce the sequential multi-object robotic grasp sampling algorithm SeqGrasp that can robustly synthesize stable grasps on diverse objects using the robotic hand’s partial Degrees of Freedom (DoF). We use SeqGrasp to construct the large-scale Allegro Hand sequential grasping dataset SeqDataset and use it for training the diffusion-based sequential grasp generator SeqDiffuser. We experimentally evaluate SeqGrasp and SeqDiffuser against the state-of-the-art non-sequential multi-object grasp generation method MultiGrasp in simulation and on a real robot. The experimental results demonstrate that SeqGrasp and SeqDiffuser reach an 8.71%-43.33% higher grasp success rate than MultiGrasp. Furthermore, SeqDiffuser is approximately 1000 times faster at generating grasps than SeqGrasp and MultiGrasp. Project page: https://yulihn.github.io/SeqGrasp/.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Data Sets for Robot Learning, Dexterous Manipulation, Grasping
National Category
Computer graphics and computer vision; Robotics and automation; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-371629 (URN); 10.1109/LRA.2025.3614051 (DOI); 001594944700028 (ISI); 2-s2.0-105017444167 (Scopus ID)
Note

QC 20251017

Available from: 2025-10-17. Created: 2025-10-17. Last updated: 2025-12-05. Bibliographically approved.