Publications (10 of 102)
Zhu, X., Mårtensson, P., Hanson, L., Björkman, M. & Maki, A. (2025). Automated assembly quality inspection by deep learning with 2D and 3D synthetic CAD data. Journal of Intelligent Manufacturing, 36(4), 2567-2582, Article ID e222.
2025 (English) In: Journal of Intelligent Manufacturing, ISSN 0956-5515, E-ISSN 1572-8145, Vol. 36, no 4, p. 2567-2582, article id e222. Article in journal (Refereed) Published
Abstract [en]

In the manufacturing industry, automatic quality inspections can lead to improved product quality and productivity. Deep learning-based computer vision technologies, with their superior performance in many applications, can be a possible solution for automatic quality inspections. However, collecting a large amount of annotated training data for deep learning is expensive and time-consuming, especially for processes involving various products and human activities such as assembly. To address this challenge, we propose a method for automated assembly quality inspection using synthetic data generated from computer-aided design (CAD) models. The method involves two steps: automatic data generation and model implementation. In the first step, we generate synthetic data in two formats: two-dimensional (2D) images and three-dimensional (3D) point clouds. In the second step, we apply different state-of-the-art deep learning approaches to the data for quality inspection, including unsupervised domain adaptation, i.e., a method of adapting models across different data distributions, and transfer learning, which transfers knowledge between related tasks. We evaluate the methods in a case study of pedal car front-wheel assembly quality inspection to identify the possible optimal approach for assembly quality inspection. Our results show that the method using Transfer Learning on 2D synthetic images achieves superior performance compared with others. Specifically, it attained 95% accuracy through fine-tuning with only five annotated real images per class. With promising results, our method may be suggested for other similar quality inspection use cases. By utilizing synthetic CAD data, our method reduces the need for manual data collection and annotation. Furthermore, our method performs well on test data with different backgrounds, making it suitable for different manufacturing environments.
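For orientation, the sketch below shows a minimal two-stage transfer-learning pipeline of the kind described in the abstract: pretrain a classifier on synthetic CAD renderings, then fine-tune on a handful of annotated real images per class. The backbone choice (ResNet-18), directory layout, class count, and hyperparameters are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal transfer-learning sketch: train on synthetic CAD renderings,
# then fine-tune on a few annotated real images per class.
# Paths, class count, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

NUM_CLASSES = 3          # hypothetical number of assembly-quality classes
tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def make_loader(root):
    # Expects an ImageFolder layout: root/<class_name>/<image>.png
    return DataLoader(datasets.ImageFolder(root, transform=tf),
                      batch_size=16, shuffle=True)

model = models.resnet18(weights="DEFAULT")          # ImageNet-pretrained backbone
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

def train(loader, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()

# Step 1: train on synthetic 2D renderings generated from CAD models.
train(make_loader("data/synthetic_cad_images"), epochs=10, lr=1e-3)
# Step 2: fine-tune on ~5 annotated real images per class.
train(make_loader("data/real_images_fewshot"), epochs=20, lr=1e-4)
```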

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
Assembly quality inspection, Computer vision, Point cloud, Synthetic data, Transfer learning, Unsupervised domain adaptation
National Category
Computer Sciences; Production Engineering, Human Work Science and Ergonomics
Identifiers
urn:nbn:se:kth:diva-363099 (URN), 10.1007/s10845-024-02375-6 (DOI), 001205028300001 (), 2-s2.0-105002924620 (Scopus ID)
Note

QC 20250506

Available from: 2025-05-06 Created: 2025-05-06 Last updated: 2025-05-19. Bibliographically approved
Zhang, Y., Rajabi, N., Taleb, F., Matviienko, A., Ma, Y., Björkman, M. & Kragic, D. (2025). Mind Meets Robots: A Review of EEG-Based Brain-Robot Interaction Systems. International Journal of Human-Computer Interaction, 1-32
2025 (English) In: International Journal of Human-Computer Interaction, ISSN 1044-7318, E-ISSN 1532-7590, p. 1-32. Article in journal (Refereed) Published
Abstract [en]

Brain-robot interaction (BRI) empowers individuals to control (semi-)automated machines through brain activity, either passively or actively. In the past decade, BRI systems have advanced significantly, primarily leveraging electroencephalogram (EEG) signals. This article presents an up-to-date review of 87 curated studies published between 2018 and 2023, identifying the research landscape of EEG-based BRI systems. The review consolidates methodologies, interaction modes, application contexts, system evaluation, existing challenges, and future directions in this domain. Based on our analysis, we propose a BRI system model comprising three entities: Brain, Robot, and Interaction, depicting their internal relationships. We especially examine interaction modes between human brains and robots, an aspect not yet fully explored. Within this model, we scrutinize and classify current research, extract insights, highlight challenges, and offer recommendations for future studies. Our findings provide a structured design space for human-robot interaction (HRI), informing the development of more efficient BRI frameworks.

Place, publisher, year, edition, pages
Informa UK Limited, 2025
Keywords
EEG based, brain-robot interaction, interaction mode, comprehensive review
National Category
Vehicle and Aerospace Engineering
Identifiers
urn:nbn:se:kth:diva-361866 (URN), 10.1080/10447318.2025.2464915 (DOI), 001446721000001 (), 2-s2.0-105000309480 (Scopus ID)
Note

QC 20250402

Available from: 2025-04-02 Created: 2025-04-02 Last updated: 2025-04-02. Bibliographically approved
Khanna, P., Naoum, A., Yadollahi, E., Björkman, M. & Smith, C. (2025). REFLEX Dataset: A Multimodal Dataset of Human Reactions to Robot Failures and Explanations. In: Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction: . Paper presented at ACM/IEEE International Conference on Human-Robot Interaction, HRI, Melbourne, Australia, March 4-6, 2025 (pp. 1032-1036). IEEE
2025 (English) In: Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction, IEEE, 2025, p. 1032-1036. Conference paper, Published paper (Refereed)
Abstract [en]

This work presents REFLEX: Robotic Explanations to FaiLures and Human EXpressions, a comprehensive multimodal dataset capturing human reactions to robot failures and subsequent explanations in collaborative settings. It aims to facilitate research into human-robot interaction dynamics, addressing the need to study reactions to both initial failures and explanations, as well as the evolution of these reactions in long-term interactions. By providing rich, annotated data on human responses to different types of failures, explanation levels, and varying explanation strategies, the dataset contributes to the development of more robust, adaptive, and satisfying robotic systems capable of maintaining positive relationships with human collaborators, even during challenges like repeated failures.

Place, publisher, year, edition, pages
IEEE, 2025
Keywords
Human Robot Interaction, Dataset, Robotic Failures, Explainable AI
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-360946 (URN), 10.5555/3721488.3721616 (DOI)
Conference
ACM/IEEE International Conference on Human-Robot Interaction, HRI, Melbourne, Australia, March 4-6, 2025
Note

QC 20250310

Available from: 2025-03-06 Created: 2025-03-06 Last updated: 2025-03-10. Bibliographically approved
Khanna, P., Naoum, A., Yadollahi, E., Björkman, M. & Smith, C. (2025). REFLEX Dataset: A Multimodal Dataset of Human Reactions to Robot Failures and Explanations. In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction: . Paper presented at 20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025 (pp. 1032-1036). Institute of Electrical and Electronics Engineers (IEEE)
2025 (English) In: HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 1032-1036. Conference paper, Published paper (Refereed)
Abstract [en]

This work presents REFLEX: Robotic Explanations to FaiLures and Human EXpressions, a comprehensive multimodal dataset capturing human reactions to robot failures and subsequent explanations in collaborative settings. It aims to facilitate research into human-robot interaction dynamics, addressing the need to study reactions to both initial failures and explanations, as well as the evolution of these reactions in long-term interactions. By providing rich, annotated data on human responses to different types of failures, explanation levels, and varying explanation strategies, the dataset contributes to the development of more robust, adaptive, and satisfying robotic systems capable of maintaining positive relationships with human collaborators, even during challenges like repeated failures.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Dataset, Explainable AI, Human Robot Interaction, Robotic Failures
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-363769 (URN), 10.1109/HRI61500.2025.10974185 (DOI), 2-s2.0-105004877597 (Scopus ID)
Conference
20th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2025, Melbourne, Australia, March 4-6, 2025
Note

Part of ISBN 9798350378931

QC 20250526

Available from: 2025-05-21 Created: 2025-05-21 Last updated: 2025-05-26. Bibliographically approved
Zhou, S., Hernandez, A. C., Gomez, C., Yin, W. & Björkman, M. (2025). SmartTBD: Smart Tracking for Resource-constrained Object Detection. ACM Transactions on Embedded Computing Systems, 24(2), Article ID 24.
2025 (English) In: ACM Transactions on Embedded Computing Systems, ISSN 1539-9087, E-ISSN 1558-3465, Vol. 24, no 2, article id 24. Article in journal (Refereed) Published
Abstract [en]

With the growing demand for video analysis on mobile devices, object tracking has proven to be a suitable complement to object detection under the Tracking-By-Detection (TBD) paradigm, reducing computational overhead and power demands. However, running TBD with fixed hyper-parameters is computationally inefficient and ignores perceptual dynamics, as fixed setups tend to run suboptimally given the variability of scenarios. In this article, we propose SmartTBD, a scheduling strategy for TBD based on multi-objective optimization of accuracy-latency metrics. SmartTBD is a novel deep reinforcement learning based scheduling architecture that computes appropriate TBD configurations for video sequences to improve speed and detection accuracy. This involves a challenging optimization problem due to the intrinsic relation between video characteristics and TBD performance. We therefore leverage video characteristics, frame information, and past TBD results to drive the optimization. Our approach surpasses baselines with fixed TBD configurations as well as recent research, achieving accuracy comparable to pure detection while significantly reducing latency. Moreover, it enables performance analysis of tracking and detection in diverse scenarios. The method proves generalizable and highly practical on common video analytics datasets on resource-constrained devices.
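As an illustration of the tracking-by-detection scheduling problem, the toy loop below alternates between an expensive detector and a cheap tracker under a latency budget. The heuristic schedule function is a hypothetical stand-in for SmartTBD's learned deep reinforcement learning policy; the detector, tracker, and timings are placeholders.

```python
# Toy tracking-by-detection loop with a pluggable scheduler.
# The paper learns the scheduling policy with deep RL; here a simple
# latency-aware heuristic stands in for that policy (illustrative only).
import time
import random

def run_detector(frame):
    time.sleep(0.05)                 # stand-in for an expensive CNN detector
    return [("object", random.random())]

def run_tracker(frame, prev_boxes):
    time.sleep(0.005)                # stand-in for a cheap tracker update
    return prev_boxes

def schedule(frame_idx, last_latency, budget_s=0.02):
    # Hypothetical policy: fall back to tracking whenever the last step
    # exceeded the latency budget, otherwise re-detect every 5th frame.
    if last_latency > budget_s:
        return "track"
    return "detect" if frame_idx % 5 == 0 else "track"

boxes, last_latency = [], 0.0
for i, frame in enumerate(range(100)):      # frames are dummies here
    start = time.time()
    action = schedule(i, last_latency)
    boxes = run_detector(frame) if action == "detect" else run_tracker(frame, boxes)
    last_latency = time.time() - start
```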

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2025
Keywords
Mobile vision, tracking-by-detection, scheduling
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-362957 (URN), 10.1145/3703912 (DOI), 001454951000008 (), 2-s2.0-105003605284 (Scopus ID)
Note

QC 20250505

Available from: 2025-05-05 Created: 2025-05-05 Last updated: 2025-05-27. Bibliographically approved
Taleb, F., Vasco, M., Ribeiro, A. H., Björkman, M. & Kragic Jensfelt, D. (2024). Can Transformers Smell Like Humans?. In: Advances in Neural Information Processing Systems 37 - 38th Conference on Neural Information Processing Systems, NeurIPS 2024: . Paper presented at 38th Conference on Neural Information Processing Systems, NeurIPS 2024, Vancouver, Canada, Dec 9 2024 - Dec 15 2024. Neural Information Processing Systems Foundation
2024 (English) In: Advances in Neural Information Processing Systems 37 - 38th Conference on Neural Information Processing Systems, NeurIPS 2024, Neural Information Processing Systems Foundation, 2024. Conference paper, Published paper (Refereed)
Abstract [en]

The human brain encodes stimuli from the environment into representations that form a sensory perception of the world. Despite recent advances in understanding visual and auditory perception, olfactory perception remains an under-explored topic in the machine learning community due to the lack of large-scale datasets annotated with labels of human olfactory perception. In this work, we ask the question of whether pre-trained transformer models of chemical structures encode representations that are aligned with human olfactory perception, i.e., can transformers smell like humans? We demonstrate that representations encoded from transformers pre-trained on general chemical structures are highly aligned with human olfactory perception. We use multiple datasets and different types of perceptual representations to show that the representations encoded by transformer models are able to predict: (i) labels associated with odorants provided by experts; (ii) continuous ratings provided by human participants with respect to pre-defined descriptors; and (iii) similarity ratings between odorants provided by human participants. Finally, we evaluate the extent to which this alignment is associated with physicochemical features of odorants known to be relevant for olfactory decoding.
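A common way to test the kind of alignment described above is a linear probe: freeze the transformer, extract one embedding per molecule, and fit a ridge regression to human perceptual ratings. The sketch below mocks the embeddings and ratings with random arrays; only the probing protocol is illustrated, not the paper's datasets or models.

```python
# Illustrative linear-probe evaluation: do fixed embeddings from a chemical
# transformer predict human perceptual ratings? Embedding extraction is
# mocked with random data; the probing protocol (ridge regression plus
# held-out correlation) is a generic stand-in for the alignment analyses.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_molecules, emb_dim, n_descriptors = 200, 256, 10
X = rng.normal(size=(n_molecules, emb_dim))          # stand-in transformer embeddings
Y = rng.normal(size=(n_molecules, n_descriptors))    # stand-in human ratings

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)
probe = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_tr, Y_tr)
Y_hat = probe.predict(X_te)

# Per-descriptor alignment score: correlation between predicted and true ratings.
scores = [pearsonr(Y_te[:, d], Y_hat[:, d])[0] for d in range(n_descriptors)]
print("mean r =", float(np.mean(scores)))
```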

Place, publisher, year, edition, pages
Neural Information Processing Systems Foundation, 2024
National Category
Neurosciences; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-361995 (URN), 2-s2.0-105000466521 (Scopus ID)
Conference
38th Conference on Neural Information Processing Systems, NeurIPS 2024, Vancouver, Canada, Dec 9 2024 - Dec 15 2024
Note

QC 20250408

Available from: 2025-04-03 Created: 2025-04-03 Last updated: 2025-04-08. Bibliographically approved
Taleb, F., Vasco, M., Rajabi, N., Björkman, M. & Kragic, D. (2024). Challenging Deep Learning Methods for EEG Signal Denoising under Data Corruption. In: 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024 - Proceedings: . Paper presented at 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024, Orlando, United States of America, Jul 15 2024 - Jul 19 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English) In: 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024 - Proceedings, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

Capturing informative electroencephalogram (EEG) signals is a challenging task due to the presence of noise (e.g., due to human movement). In extreme cases, data recordings from specific electrodes (channels) can become corrupted and entirely devoid of information. Motivated by recent work on deep-learning-based approaches for EEG signal denoising, we present the first benchmark study on the performance of EEG signal denoising methods in the presence of corrupted channels. We design our study considering a wide variety of datasets, models, and evaluation tasks. Our results highlight the need for assessing the performance of EEG deep-learning models across a broad suite of datasets, as provided by our benchmark.
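The corruption setting can be made concrete with a small simulation: zero out a random subset of EEG channels and score how well a denoiser recovers the clean epoch. The arrays, corruption fraction, identity "denoiser", and relative-RMSE metric below are illustrative assumptions, not the benchmark's actual models or datasets.

```python
# Sketch of a channel-corruption protocol: drop a random subset of EEG
# channels (set them to zero) and measure how much signal a denoiser
# recovers. The "denoiser" here is an identity placeholder, not any of
# the benchmarked deep-learning models.
import numpy as np

rng = np.random.default_rng(1)
n_channels, n_samples = 32, 1024
clean = rng.normal(size=(n_channels, n_samples))     # stand-in clean EEG epoch

def corrupt(x, frac=0.25):
    x = x.copy()
    bad = rng.choice(x.shape[0], size=int(frac * x.shape[0]), replace=False)
    x[bad] = 0.0                                      # fully corrupted channels
    return x, bad

def denoise(x):
    return x                                          # placeholder model

def rrmse(est, ref):
    # Relative root-mean-square error between estimate and reference.
    return np.linalg.norm(est - ref) / np.linalg.norm(ref)

corrupted, bad_channels = corrupt(clean)
print("RRMSE after 'denoising':", rrmse(denoise(corrupted), clean))
```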

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
data corruption, deep learning, EEG, signal denoising, signal noise
National Category
Signal Processing; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-358866 (URN), 10.1109/EMBC53108.2024.10782132 (DOI), 40039138 (PubMedID), 2-s2.0-85214969123 (Scopus ID)
Conference
46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024, Orlando, United States of America, Jul 15 2024 - Jul 19 2024
Note

Part of ISBN 9798350371499

QC 20250128

Available from: 2025-01-23 Created: 2025-01-23 Last updated: 2025-05-27. Bibliographically approved
Longhini, A., Büsching, M., Duisterhof, B. P., Lundell, J., Ichnowski, J., Björkman, M. & Kragic, D. (2024). Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision. In: Proceedings of the 8th Conference on Robot Learning, CoRL 2024: . Paper presented at 8th Annual Conference on Robot Learning, November 6-9, 2024, Munich, Germany (pp. 2845-2865). ML Research Press
2024 (English) In: Proceedings of the 8th Conference on Robot Learning, CoRL 2024, ML Research Press, 2024, p. 2845-2865. Conference paper, Published paper (Refereed)
Abstract [en]

We introduce Cloth-Splatting, a method for estimating 3D states of cloth from RGB images through a prediction-update framework. Cloth-Splatting leverages an action-conditioned dynamics model for predicting future states and uses 3D Gaussian Splatting to update the predicted states. Our key insight is that coupling a 3D mesh-based representation with Gaussian Splatting allows us to define a differentiable map between the cloth's state space and the image space. This enables the use of gradient-based optimization techniques to refine inaccurate state estimates using only RGB supervision. Our experiments demonstrate that Cloth-Splatting not only improves state estimation accuracy over current baselines but also reduces convergence time by ∼85 %.
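The prediction-update idea can be sketched as follows: a dynamics model predicts the next cloth state, and because the state-to-image map is differentiable, an RGB reconstruction loss can refine that prediction by gradient descent. Here a random linear map stands in for the Gaussian Splatting renderer and the dynamics model is a placeholder; both are assumptions for illustration only.

```python
# Minimal prediction-update sketch: predict the next cloth state with a
# dynamics model, then refine it by backpropagating an RGB residual through
# a differentiable state-to-image map. The "renderer" is a toy linear map,
# not Gaussian Splatting.
import torch

n_vertices, img_dim = 50, 100
render_matrix = torch.randn(img_dim, 3 * n_vertices)   # toy differentiable "renderer"

def render(state):                       # state: (n_vertices, 3) mesh vertices
    return render_matrix @ state.reshape(-1)

def predict(state, action):
    return state + 0.01 * action         # placeholder action-conditioned dynamics

state = torch.zeros(n_vertices, 3)
action = torch.randn(n_vertices, 3)
observed_image = render(torch.randn(n_vertices, 3))    # stand-in RGB observation

# Prediction step.
pred = predict(state, action).detach().requires_grad_(True)

# Update step: refine the predicted state against the observed image.
opt = torch.optim.Adam([pred], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(render(pred), observed_image)
    loss.backward()
    opt.step()
```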

Place, publisher, year, edition, pages
ML Research Press, 2024
Keywords
3D State Estimation, Gaussian Splatting, Vision-based Tracking, Deformable Objects
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-357192 (URN), 2-s2.0-86000735293 (Scopus ID)
Conference
8th Annual Conference on Robot Learning, November 6-9, 2024, Munich, Germany
Note

QC 20250328

Available from: 2024-12-04 Created: 2024-12-04 Last updated: 2025-03-28. Bibliographically approved
Khanna, P., Fredberg, J., Björkman, M., Smith, C. & Linard, A. (2024). Hand it to me formally! Data-driven control for human-robot handovers with signal temporal logic. IEEE Robotics and Automation Letters, 9(10), 9039-9046
2024 (English) In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 9, no 10, p. 9039-9046. Article in journal (Refereed) Published
Abstract [en]

To facilitate human-robot interaction (HRI), we aim for robot behavior that is efficient, transparent, and closely resembles human actions. Signal Temporal Logic (STL) is a formal language that enables the specification and verification of complex temporal properties in robotic systems, helping to ensure their correctness. STL can be used to generate explainable robot behavior, the degree of satisfaction of which can be quantified by checking its STL robustness. In this letter, we use data-driven STL inference techniques to model human behavior in human-human interactions on a handover dataset. We then use the learned model to generate robot behavior in human-robot interactions. We present a handover planner based on inferred STL specifications to command robotic motion in human-robot handovers. We also validate our method in a human-to-robot handover experiment.
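To make "STL robustness" concrete, the snippet below evaluates the quantitative semantics of two simple temporal operators over a synthetic handover trace: "eventually the hand distance drops below a threshold" and "the hand speed always stays below a limit". The signals, thresholds, and formulas are illustrative, not the specifications inferred in the letter.

```python
# Quantitative semantics (robustness) of two simple STL fragments over a
# synthetic handover trace. Positive robustness means the formula is
# satisfied; the magnitude measures by how much.
import numpy as np

def robustness_eventually(signal, threshold):
    # F (signal < threshold): take the most-satisfied point along the trace.
    return np.max(threshold - signal)

def robustness_always(signal, threshold):
    # G (signal < threshold): take the least-satisfied point along the trace.
    return np.min(threshold - signal)

t = np.linspace(0.0, 2.0, 200)
hand_distance = 0.6 * np.exp(-2.0 * t)        # stand-in giver-receiver distance [m]
hand_speed = np.abs(np.gradient(hand_distance, t))

# "The hands eventually come within 5 cm" and "hand speed stays below 1 m/s".
print(robustness_eventually(hand_distance, 0.05))
print(robustness_always(hand_speed, 1.0))
```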

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Handover, Robots, Robot kinematics, Behavioral sciences, Trajectory, Logic, Robustness, Human-robot handovers, Signal Temporal Logic (STL)
National Category
Robotics and automation; Control Engineering
Identifiers
urn:nbn:se:kth:diva-354524 (URN), 10.1109/LRA.2024.3447476 (DOI), 001316210300020 (), 2-s2.0-85201769650 (Scopus ID)
Note

QC 20241011

Available from: 2024-10-11 Created: 2024-10-11 Last updated: 2025-02-05. Bibliographically approved
Reichlin, A., Tegner, G., Vasco, M., Yin, H., Björkman, M. & Kragic, D. (2024). Reducing Variance in Meta-Learning via Laplace Approximation for Regression Tasks. Transactions on Machine Learning Research, 2024
2024 (English) In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2024. Article in journal (Refereed) Published
Abstract [en]

Given a finite set of sample points, meta-learning algorithms aim to learn an optimal adaptation strategy for new, unseen tasks. Often, this data can be ambiguous as it might belong to different tasks concurrently. This is particularly the case in meta-regression tasks. In such cases, the estimated adaptation strategy is subject to high variance due to the limited amount of support data for each task, which often leads to sub-optimal generalization performance. In this work, we address the problem of variance reduction in gradient-based meta-learning and formalize the class of problems prone to this, a condition we refer to as task overlap. Specifically, we propose a novel approach that reduces the variance of the gradient estimate by weighting each support point individually by the variance of its posterior over the parameters. To estimate the posterior, we utilize the Laplace approximation, which allows us to express the variance in terms of the curvature of the loss landscape of our meta-learner. Experimental results demonstrate the effectiveness of the proposed method and highlight the importance of variance reduction in meta-learning.
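The weighting idea can be sketched with a diagonal Laplace (curvature) approximation on a toy regression task: estimate a per-support-point posterior precision from the curvature of its loss and use it to weight the adaptation gradient. The inverse-variance weighting, Gauss-Newton curvature proxy, and single gradient step below are schematic assumptions, not the paper's exact estimator.

```python
# Schematic curvature-based weighting in the spirit of the method above:
# each support point gets a weight derived from a diagonal Laplace
# (curvature) approximation, and the adaptation gradient is a weighted sum.
# Illustrative stand-in only.
import torch

torch.manual_seed(0)
theta = torch.zeros(5, requires_grad=True)       # meta-learned initialization
X = torch.randn(8, 5)                            # support inputs for one task
y = torch.randn(8)                               # support targets

def per_point_loss(params):
    # Squared-error loss of each support point under the current parameters.
    return 0.5 * (X @ params - y) ** 2

# Per-point curvature proxy: trace of the Gauss-Newton Hessian,
# which for squared error is ||x_i||^2.
curvature = (X ** 2).sum(dim=1)
prior_precision = 1.0
precision = curvature + prior_precision           # Laplace posterior precision proxy
weights = precision / precision.sum()             # schematic inverse-variance weighting

# One weighted adaptation (inner-loop) step on the support set.
loss = (weights * per_point_loss(theta)).sum()
loss.backward()
adapted_theta = theta - 0.1 * theta.grad
```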

Place, publisher, year, edition, pages
Transactions on Machine Learning Research, 2024
National Category
Robotics and automation; Control Engineering
Identifiers
urn:nbn:se:kth:diva-361197 (URN), 2-s2.0-85219566964 (Scopus ID)
Note

QC 20250312

Available from: 2025-03-12 Created: 2025-03-12 Last updated: 2025-03-12. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-0579-3372
