Publications (5 of 5)
Zhang, Q., Yang, Y., Li, P., Andersson, O. & Jensfelt, P. (2025). SeFlow: A Self-supervised Scene Flow Method in Autonomous Driving. In: Roth, S., Russakovsky, O., Sattler, T., Varol, G., Leonardis, A. & Ricci, E. (Eds.), COMPUTER VISION-ECCV 2024, PT I. Paper presented at the 18th European Conference on Computer Vision (ECCV), SEP 29-OCT 04, 2024, Milan, ITALY (pp. 353-369). Springer Nature, Vol. 15059
SeFlow: A Self-supervised Scene Flow Method in Autonomous Driving
2025 (English). In: COMPUTER VISION-ECCV 2024, PT I / [ed] Roth, S., Russakovsky, O., Sattler, T., Varol, G., Leonardis, A. & Ricci, E., Springer Nature, 2025, Vol. 15059, p. 353-369. Conference paper, Published paper (Refereed)
Abstract [en]

Scene flow estimation predicts the 3D motion at each point in successive LiDAR scans. This detailed, point-level information can help autonomous vehicles to accurately predict and understand dynamic changes in their surroundings. Current state-of-the-art methods require annotated data to train scene flow networks, and the expense of labeling inherently limits their scalability. Self-supervised approaches can overcome the above limitations, yet face two principal challenges that hinder optimal performance: point distribution imbalance and disregard for object-level motion constraints. In this paper, we propose SeFlow, a self-supervised method that integrates efficient dynamic classification into a learning-based scene flow pipeline. We demonstrate that classifying static and dynamic points helps design targeted objective functions for different motion patterns. We also emphasize the importance of internal cluster consistency and correct object point association to refine the scene flow estimation, in particular on object details. Our real-time capable method achieves state-of-the-art performance on the self-supervised scene flow task on the Argoverse 2 and Waymo datasets. The code is open-sourced at https://github.com/KTH-RPL/SeFlow.
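
To make the idea concrete, below is a minimal, illustrative sketch of a self-supervised objective that treats static and dynamic points differently and encourages per-cluster flow consistency. It is an assumption-laden approximation, not the paper's actual loss: the dynamic mask, cluster labels, and pseudo flow targets are assumed to come from upstream steps (e.g. dynamic classification and clustering of the point cloud).

```python
# Illustrative sketch only (not the official SeFlow objective).
# Assumes an upstream step has produced a dynamic/static mask, per-point
# cluster ids for dynamic points, and pseudo flow targets (e.g. from
# nearest-neighbour correspondences with the next LiDAR scan), and that
# both static and dynamic points are present in the batch.
import torch

def selfsup_flow_loss(pred_flow: torch.Tensor,     # (N, 3) predicted per-point flow
                      dynamic_mask: torch.Tensor,  # (N,) bool, True = dynamic point
                      cluster_ids: torch.Tensor,   # (N,) long, cluster label (-1 = none)
                      dyn_target: torch.Tensor     # (N, 3) pseudo targets for dynamic points
                      ) -> torch.Tensor:
    # Static points should have (near) zero motion: push their flow to zero.
    static_loss = pred_flow[~dynamic_mask].norm(dim=-1).mean()

    # Dynamic points: regress toward the pseudo flow targets.
    dyn_loss = (pred_flow[dynamic_mask] - dyn_target[dynamic_mask]).norm(dim=-1).mean()

    # Cluster consistency: points belonging to one object should move together,
    # so penalise the spread of predicted flow inside each cluster.
    consistency = pred_flow.new_tensor(0.0)
    for cid in torch.unique(cluster_ids):
        if cid < 0:
            continue
        members = pred_flow[cluster_ids == cid]
        consistency = consistency + (members - members.mean(dim=0)).norm(dim=-1).mean()

    return static_loss + dyn_loss + 0.1 * consistency
```

The relative weight of the consistency term (0.1 here) is arbitrary and would be tuned in practice.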

Place, publisher, year, edition, pages
Springer Nature, 2025
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 15059
Keywords
3D scene flow, self-supervised, autonomous driving, large-scale point cloud
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-357529 (URN); 10.1007/978-3-031-73232-4_20 (DOI); 001346378300020; 2-s2.0-85206389477 (Scopus ID)
Conference
18th European Conference on Computer Vision (ECCV), SEP 29-OCT 04, 2024, Milan, ITALY
Note

Part of ISBN 978-3-031-73231-7; 978-3-031-73232-4

QC 20241209

Available from: 2024-12-09. Created: 2024-12-09. Last updated: 2025-02-07. Bibliographically approved.
Yang, Y., Zhang, Q., Ikemura, K., Batool, N. & Folkesson, J. (2024). Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models. In: 35th IEEE Intelligent Vehicles Symposium, IV 2024. Paper presented at the 35th IEEE Intelligent Vehicles Symposium, IV 2024, Jeju Island, Korea, Jun 2 2024 - Jun 5 2024 (pp. 2405-2412). Institute of Electrical and Electronics Engineers (IEEE)
Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models
2024 (English). In: 35th IEEE Intelligent Vehicles Symposium, IV 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 2405-2412. Conference paper, Published paper (Refereed)
Abstract [en]

Addressing hard cases in autonomous driving, such as anomalous road users, extreme weather conditions, and complex traffic interactions, presents significant challenges. To ensure safety, it is crucial that autonomous driving systems detect and manage these scenarios effectively. However, the rarity and high-risk nature of these cases demand extensive, diverse datasets for training robust models. Vision-Language Foundation Models (VLMs), trained on extensive datasets, have shown remarkable zero-shot capabilities. This work explores the potential of VLMs in detecting hard cases in autonomous driving. We demonstrate the capability of VLMs such as GPT-4v in detecting hard cases in traffic participant motion prediction on both the agent and scenario levels. We introduce a feasible pipeline where VLMs, fed with sequential image frames with designed prompts, effectively identify challenging agents or scenarios, which are verified by existing prediction models. Moreover, by taking advantage of this detection of hard cases by VLMs, we further improve the training efficiency of the existing motion prediction pipeline by performing data selection on the training samples suggested by GPT. We show the effectiveness and feasibility of our pipeline incorporating VLMs with state-of-the-art methods on the NuScenes dataset. The code is accessible at https://github.com/KTH-RPL/Detect-VLM.
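
As an illustration of such a pipeline (not the authors' implementation), the sketch below asks a vision-language model to rate the difficulty of a scenario from a few image frames and keeps only the highly rated scenarios for targeted training. The helper `query_vlm` is hypothetical and stands in for an actual VLM call such as GPT-4V; the prompt wording and threshold are likewise assumptions.

```python
# Hedged sketch of a hard-case detection pipeline. `query_vlm` is a
# hypothetical helper that sends images plus a text prompt to a VLM and
# returns its raw text answer.
from typing import List

PROMPT = (
    "You are given consecutive frames of a driving scene. "
    "Rate how hard it is to predict the motion of the surrounding agents "
    "on a scale from 0 (trivial) to 10 (very hard). Answer with a single number."
)

def score_scenario(frames: List[bytes], query_vlm) -> float:
    """Ask the VLM for a scenario-level hardness score."""
    answer = query_vlm(images=frames, prompt=PROMPT)
    try:
        return float(answer.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0  # unparsable answers are treated as "not hard"

def select_hard_cases(dataset, query_vlm, threshold: float = 7.0):
    """Keep only scenarios the VLM rates as hard; these can then be
    oversampled when training a motion prediction model."""
    return [sample for sample in dataset
            if score_scenario(sample["frames"], query_vlm) >= threshold]
```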

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-351756 (URN); 10.1109/IV55156.2024.10588694 (DOI); 001275100902068; 2-s2.0-85199784263 (Scopus ID)
Conference
35th IEEE Intelligent Vehicles Symposium, IV 2024, Jeju Island, Korea, Jun 2 2024 - Jun 5 2024
Note

Part of ISBN 9798350348811

QC 20240815

Available from: 2024-08-13. Created: 2024-08-13. Last updated: 2025-02-07. Bibliographically approved.
Yang, Y., Zhang, Q., Li, C., Simões Marta, D., Batool, N. & Folkesson, J. (2024). Human-Centric Autonomous Systems With LLMs for User Command Reasoning. In: 2024 IEEE Winter Conference on Applications of Computer Vision Workshops, WACVW 2024. Paper presented at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), JAN 04-08, 2024, Waikoloa, HI (pp. 988-994). Institute of Electrical and Electronics Engineers (IEEE)
Human-Centric Autonomous Systems With LLMs for User Command Reasoning
2024 (English). In: 2024 IEEE Winter Conference on Applications of Computer Vision Workshops, WACVW 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 988-994. Conference paper, Published paper (Refereed)
Abstract [en]

Autonomous driving has made remarkable advancements in recent years, evolving into a tangible reality. However, human-centric, large-scale adoption hinges on meeting a variety of multifaceted requirements. To ensure that the autonomous system meets the user's intent, it is essential to accurately discern and interpret user commands, especially in complex or emergency situations. To this end, we propose to leverage the reasoning capabilities of Large Language Models (LLMs) to infer system requirements from in-cabin users' commands. Through a series of experiments that include different LLM models and prompt designs, we explore the few-shot multivariate binary classification accuracy of system requirements from natural language textual commands. We confirm the general ability of LLMs to understand and reason about prompts but underline that their effectiveness is conditioned on the quality of both the LLM model and the design of appropriate sequential prompts. Code and models are publicly available at https://github.com/KTH-RPL/DriveCmd_LLM.
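
A minimal sketch of the few-shot, multivariate binary classification described above is given below. It is illustrative only: `query_llm` is a hypothetical text-in/text-out helper, and the requirement names are placeholders rather than the label set used in the paper.

```python
# Hedged illustration of few-shot multivariate binary classification of
# system requirements from a user command. `query_llm` is a hypothetical
# helper; the requirement names below are examples, not the paper's labels.
REQUIREMENTS = ["pull_over", "change_speed", "reroute", "call_for_help"]

FEW_SHOT = """Command: "I feel sick, please stop as soon as it is safe."
Answer: pull_over=1, change_speed=1, reroute=0, call_for_help=0
"""

def classify_command(command: str, query_llm) -> dict:
    # Build a few-shot prompt asking for one 0/1 value per requirement.
    prompt = (
        "For the driving command below, output a 0/1 value for each requirement "
        f"({', '.join(REQUIREMENTS)}), in the same format as the example.\n\n"
        f"{FEW_SHOT}\nCommand: \"{command}\"\nAnswer:"
    )
    answer = query_llm(prompt)

    # Parse "name=value" pairs from the model's answer into binary flags.
    flags = {name: 0 for name in REQUIREMENTS}
    for part in answer.replace(" ", "").split(","):
        if "=" in part:
            name, value = part.split("=", 1)
            if name in flags:
                flags[name] = int(value in ("1", "true", "True"))
    return flags
```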

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
IEEE Winter Conference on Applications of Computer Vision Workshops, ISSN 2572-4398
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-351635 (URN); 10.1109/WACVW60836.2024.00108 (DOI); 001223022200040; 2-s2.0-85188691382 (Scopus ID)
Conference
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), JAN 04-08, 2024, Waikoloa, HI
Note

QC 20240813

Part of ISBN 979-8-3503-7028-7, 979-8-3503-7071-3

Available from: 2024-08-13. Created: 2024-08-13. Last updated: 2024-10-11. Bibliographically approved.
Zhang, Q., Duberg, D., Geng, R., Jia, M., Wang, L. & Jensfelt, P. (2023). A Dynamic Points Removal Benchmark in Point Cloud Maps. In: 2023 IEEE 26th International Conference on Intelligent Transportation Systems, ITSC 2023. Paper presented at the 26th IEEE International Conference on Intelligent Transportation Systems, ITSC 2023, Bilbao, Spain, Sep 24 2023 - Sep 28 2023 (pp. 608-614). Institute of Electrical and Electronics Engineers (IEEE)
A Dynamic Points Removal Benchmark in Point Cloud Maps
2023 (English). In: 2023 IEEE 26th International Conference on Intelligent Transportation Systems, ITSC 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 608-614. Conference paper, Published paper (Refereed)
Abstract [en]

In the field of robotics, the point cloud has become an essential map representation. From the perspective of downstream tasks like localization and global path planning, points corresponding to dynamic objects will adversely affect their performance. Existing methods for removing dynamic points in point clouds often lack clarity in comparative evaluations and comprehensive analysis. Therefore, we propose an easy-to-extend unified benchmarking framework for evaluating techniques for removing dynamic points in maps. It includes refactored state-of-the-art methods and novel metrics to analyze the limitations of these approaches. This enables researchers to dive deep into the underlying reasons behind these limitations. The benchmark makes use of several datasets with different sensor types. All the code and datasets related to our study are publicly available for further development and utilization.
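
To illustrate what such an evaluation looks like, the sketch below computes two simple scores from ground-truth static/dynamic labels and the mask of map points a method kept. The metric names and the harmonic-mean aggregation are illustrative choices, not necessarily the metrics defined in the paper.

```python
# Hedged sketch of a dynamic-point-removal evaluation (illustrative metrics).
# Given per-point ground-truth dynamic labels and a boolean mask of the
# points a method kept, measure how much of the static map was preserved
# and how many dynamic points were removed.
import numpy as np

def removal_metrics(is_dynamic_gt: np.ndarray, kept_mask: np.ndarray) -> dict:
    static = ~is_dynamic_gt
    static_preserved = kept_mask[static].mean() if static.any() else 1.0
    dynamic_removed = (~kept_mask[is_dynamic_gt]).mean() if is_dynamic_gt.any() else 1.0
    return {
        "static_preservation": float(static_preserved),  # higher is better
        "dynamic_removal": float(dynamic_removed),        # higher is better
        # Harmonic mean balances "keep the map" against "remove the ghosts".
        "harmonic_mean": float(2 * static_preserved * dynamic_removed
                               / (static_preserved + dynamic_removed + 1e-9)),
    }
```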

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-344365 (URN); 10.1109/ITSC57777.2023.10422094 (DOI); 2-s2.0-85186537890 (Scopus ID)
Conference
26th IEEE International Conference on Intelligent Transportation Systems, ITSC 2023, Bilbao, Spain, Sep 24 2023 - Sep 28 2023
Note

Part of ISBN 9798350399462

QC 20240315

Available from: 2024-03-13. Created: 2024-03-13. Last updated: 2025-02-09. Bibliographically approved.
Yang, Y., Zhang, Q., Gilles, T., Batool, N. & Folkesson, J. (2023). RMP: A Random Mask Pretrain Framework for Motion Prediction. In: 2023 IEEE 26th International Conference on Intelligent Transportation Systems, ITSC 2023. Paper presented at the 26th IEEE International Conference on Intelligent Transportation Systems, ITSC 2023, Bilbao, Spain, Sep 24 2023 - Sep 28 2023 (pp. 3717-3723). Institute of Electrical and Electronics Engineers (IEEE)
RMP: A Random Mask Pretrain Framework for Motion Prediction
2023 (English). In: 2023 IEEE 26th International Conference on Intelligent Transportation Systems, ITSC 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 3717-3723. Conference paper, Published paper (Refereed)
Abstract [en]

Although pretraining techniques are growing in popularity, little work has been done on pretrained, learning-based motion prediction methods in autonomous driving. In this paper, we propose a framework to formalize the pretraining task for trajectory prediction of traffic participants. Within our framework, inspired by random masked models in natural language processing (NLP) and computer vision (CV), objects' positions at random timesteps are masked and then filled in by the learned neural network (NN). By changing the mask profile, our framework can easily switch among a range of motion-related tasks. Evaluations on the Argoverse and NuScenes datasets show that our proposed pretraining framework is able to deal with noisy inputs and improves motion prediction accuracy and miss rate, especially for objects occluded over time.
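
The sketch below illustrates the masking idea in its simplest form: positions of agents at randomly chosen timesteps are hidden, and a network is trained to reconstruct only those entries. The tensor layout, mask ratio, and zero-fill placeholder are assumptions made for illustration; the actual mask profiles and model are defined by the paper and its code.

```python
# Hedged sketch of random-mask pretraining for trajectories (not the RMP code).
import torch

def random_mask_trajectories(traj: torch.Tensor, mask_ratio: float = 0.3):
    """traj: (num_agents, num_timesteps, 2) xy positions.
    Returns the masked trajectories and the boolean mask of hidden entries."""
    num_agents, num_steps, _ = traj.shape
    mask = torch.rand(num_agents, num_steps) < mask_ratio   # True = hidden position
    masked = traj.clone()
    masked[mask] = 0.0                                       # simple "blank" placeholder
    return masked, mask

def reconstruction_loss(pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor):
    """Only the masked positions contribute to the pretraining loss."""
    return ((pred - target)[mask] ** 2).mean()
```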

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, ISSN 2153-0009
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-344363 (URN); 10.1109/ITSC57777.2023.10422522 (DOI); 001178996703113; 2-s2.0-85186535191 (Scopus ID)
Conference
26th IEEE International Conference on Intelligent Transportation Systems, ITSC 2023, Bilbao, Spain, Sep 24 2023 - Sep 28 2023
Note

Part of ISBN 979-835039946-2

QC 20240315

Available from: 2024-03-13. Created: 2024-03-13. Last updated: 2025-02-07. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-7882-948X
