Folkesson, John, Associate Professor (ORCID iD: orcid.org/0000-0002-7796-1438)
Publications (10 of 103)
Zhang, J., Xie, Y., Ling, L. & Folkesson, J. (2025). A Dense Subframe-Based SLAM Framework With Side-Scan Sonar. IEEE Journal of Oceanic Engineering, 50(2), 1087-1102
2025 (English). In: IEEE Journal of Oceanic Engineering, ISSN 0364-9059, E-ISSN 1558-1691, Vol. 50, no. 2, p. 1087-1102. Article in journal (Refereed). Published.
Abstract [en]

Side-scan sonar (SSS) is a lightweight acoustic sensor commonly deployed on autonomous underwater vehicles (AUVs) to provide high-resolution seafloor images. However, leveraging side-scan images for simultaneous localization and mapping (SLAM) presents a notable challenge, primarily due to the difficulty of establishing a sufficient number of accurate correspondences between these images. To address this, we introduce a novel subframe-based dense SLAM framework utilizing SSS data, enabling effective dense matching in overlapping regions of paired side-scan images. With each image being evenly divided into subframes, we propose a robust estimation pipeline to estimate the relative pose between each paired subframe using a good inlier set identified from dense correspondences. These relative poses are then integrated as edge constraints in a factor graph to optimize the AUV pose trajectory. The proposed framework is evaluated on three real data sets collected by a Hugin AUV. One of these data sets contains manually annotated keypoint correspondences as ground truth and is used for the evaluation of pose trajectory. We also present a feasible way of evaluating mapping quality against multi-beam echosounder data without the influence of pose. Experimental results demonstrate that our approach effectively mitigates drift from the dead-reckoning system and enables quasi-dense bathymetry reconstruction.
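The pose-graph construction described above can be illustrated with a short sketch. The snippet below is a minimal example using the GTSAM Python bindings, not the authors' released code: dead-reckoning odometry and a subframe-derived relative pose both enter the graph as between-factors, and the trajectory is optimized jointly. All keys, pose increments, and noise sigmas are invented for illustration.

```python
# Minimal pose-graph sketch (illustrative, not the paper's implementation):
# fuse dead-reckoning odometry with a side-scan-derived relative-pose edge.
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

# Assumed noise models: DR odometry drifts; sonar-derived edges are tighter.
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.5, 0.5, 0.05]))
sss_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.01]))
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([1e-3, 1e-3, 1e-3]))

# Anchor the first pose of the trajectory.
graph.add(gtsam.PriorFactorPose2(0, gtsam.Pose2(0, 0, 0), prior_noise))

# Dead-reckoning chain: consecutive poses linked by odometry factors.
for i in range(3):
    graph.add(gtsam.BetweenFactorPose2(
        i, i + 1, gtsam.Pose2(10.0, 0.0, 0.0), odom_noise))
    initial.insert(i, gtsam.Pose2(10.0 * i, 0.0, 0.0))
initial.insert(3, gtsam.Pose2(30.0, 0.0, 0.0))

# One subframe-derived edge (e.g. a loop closure) with hypothetical values.
graph.add(gtsam.BetweenFactorPose2(0, 3, gtsam.Pose2(29.5, 0.2, 0.0), sss_noise))

poses = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(poses)
```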

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Autonomous underwater vehicle (AUV), dense matching, factor graph, quasi-dense bathymetry, side-scan sonar (SSS), simultaneous localization and mapping (SLAM), subframe
National Category
Robotics and automation; Computer graphics and computer vision; Control Engineering
Identifiers
urn:nbn:se:kth:diva-363113 (URN); 10.1109/JOE.2024.3503663 (DOI); 001385777600001 (ISI); 2-s2.0-105003293381 (Scopus ID)
Note

QC 20250506

Available from: 2025-05-06. Created: 2025-05-06. Last updated: 2025-05-06. Bibliographically approved.
Xie, Y., Zhang, J., Bore, N. & Folkesson, J. (2025). NeuRSS: Enhancing AUV Localization and Bathymetric Mapping With Neural Rendering for Sidescan SLAM. IEEE Journal of Oceanic Engineering, 1-10
2025 (English). In: IEEE Journal of Oceanic Engineering, ISSN 0364-9059, E-ISSN 1558-1691, p. 1-10. Article in journal (Refereed). Epub ahead of print.
Abstract [en]

Implicit neural representations and neural rendering have gained increasing attention for bathymetry estimation from sidescan sonar (SSS). These methods incorporate multiple observations of the same place from SSS data to constrain the elevation estimate, converging to a globally consistent bathymetric model. However, the quality and precision of the bathymetric estimate are limited by the positioning accuracy of the autonomous underwater vehicle (AUV) equipped with the sonar. The global positioning estimate of the AUV relying on dead reckoning (DR) has an unbounded error due to the absence of a geo-reference system like GPS underwater. To address this challenge, we propose in this article a modern and scalable framework, NeuRSS, for SSS SLAM based on DR and loop closures (LCs) over large timescales, with an elevation prior provided by the bathymetric estimate using neural rendering from SSS. This framework is an iterative procedure that improves localization and bathymetric mapping. Initially, the bathymetry estimated from SSS using the DR estimate, though crude, can provide an important elevation prior in the nonlinear least-squares (NLS) optimization that estimates the relative pose between two LC vertices in a pose graph. Subsequently, the global pose estimate from the SLAM component improves the positioning estimate of the vehicle, thus improving the bathymetry estimation. We validate our localization and mapping approach on two large surveys collected with a surface vessel and an AUV, respectively. We evaluate the localization results against the ground truth and compare the bathymetry estimation against data collected with multibeam echo sounders (MBESs).
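The alternating structure of the framework (neural-rendering bathymetry, then SLAM with an elevation prior, repeated) can be summarized in a few lines. In the sketch below, every function name (estimate_bathymetry, estimate_relative_pose, optimize_pose_graph) is a hypothetical placeholder standing in for a whole subsystem, not the released API.

```python
# Structural sketch of an iterative NeuRSS-style procedure (placeholders only).
def iterate_neurss(sss_pings, dr_trajectory, loop_closures, n_iters=2):
    poses = dr_trajectory  # start from dead reckoning
    bathymetry = None
    for _ in range(n_iters):
        # 1) Neural rendering: fit a bathymetric model to the SSS pings
        #    given the current pose estimates (crude on the first pass).
        bathymetry = estimate_bathymetry(sss_pings, poses)
        # 2) For each loop closure, a nonlinear least-squares fit estimates
        #    the relative pose between the two vertices, using the current
        #    bathymetry as an elevation prior.
        edges = [estimate_relative_pose(lc, bathymetry) for lc in loop_closures]
        # 3) Pose-graph optimization improves the global trajectory, which
        #    in turn improves the next bathymetry estimate.
        poses = optimize_pose_graph(poses, edges)
    return poses, bathymetry
```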

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-365099 (URN); 10.1109/joe.2024.3501317 (DOI); 001395154900001 (ISI); 2-s2.0-85214477756 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation; Swedish Foundation for Strategic Research
Note

QC 20250701

Available from: 2025-06-18. Created: 2025-06-18. Last updated: 2025-07-01. Bibliographically approved.
Zhang, J., Xie, Y., Ling, L. & Folkesson, J. (2024). A fully-automatic side-scan sonar simultaneous localization and mapping framework. IET radar, sonar & navigation, 18(5), 674-683
2024 (English). In: IET radar, sonar & navigation, ISSN 1751-8784, E-ISSN 1751-8792, Vol. 18, no. 5, p. 674-683. Article in journal (Refereed). Published.
Abstract [en]

Side-scan sonar is a lightweight acoustic sensor that is frequently deployed on autonomous underwater vehicles (AUVs) to provide high-resolution seafloor images. However, using side-scan images to perform simultaneous localization and mapping (SLAM) remains a challenge when there is a lack of 3D bathymetric information and discriminative features in the side-scan images. To tackle this, the authors propose a feature-based SLAM framework using side-scan sonar, which is able to automatically detect and robustly match keypoints between paired side-scan images. The authors then use the detected correspondences as constraints to optimise the AUV pose trajectory. The proposed method is evaluated on real data collected by a Hugin AUV, using as a ground truth reference both manually annotated keypoints and a 3D bathymetry mesh from a multibeam echosounder (MBES). Experimental results demonstrate that this approach is able to reduce drift from the dead-reckoning system. The framework is made publicly available for the benefit of the community.
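To make "automatically detect and robustly match keypoints" concrete, here is a generic keypoint-matching sketch with OpenCV SIFT and Lowe's ratio test. This is a stand-in: the paper's detector and matching scheme for side-scan imagery are its own, and the filenames below are hypothetical.

```python
# Generic keypoint matching between two side-scan images (illustrative only).
import cv2

img1 = cv2.imread("sss_waterfall_a.png", cv2.IMREAD_GRAYSCALE)  # hypothetical
img2 = cv2.imread("sss_waterfall_b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# k-NN matching with a ratio test to keep only distinctive correspondences.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = []
for pair in matcher.knnMatch(des1, des2, k=2):
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# Each surviving correspondence becomes a constraint on the pose trajectory.
print(f"{len(good)} candidate correspondences")
```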

Place, publisher, year, edition, pages
Institution of Engineering and Technology (IET), 2024
Keywords
autonomous underwater vehicles, feature extraction, geometry, marine navigation, optimisation, pattern matching, sonar imaging
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-350288 (URN); 10.1049/rsn2.12500 (DOI); 001101568100001 (ISI); 2-s2.0-85176947901 (Scopus ID)
Note

QC 20240711

Available from: 2024-07-11. Created: 2024-07-11. Last updated: 2024-07-11. Bibliographically approved.
Xie, Y., Troni, G., Bore, N. & Folkesson, J. (2024). Bathymetric Surveying With Imaging Sonar Using Neural Volume Rendering. IEEE Robotics and Automation Letters, 9(9), 8146-8153
2024 (English). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 9, no. 9, p. 8146-8153. Article in journal (Refereed). Published.
Abstract [en]

This research addresses the challenge of estimating bathymetry from imaging sonars, where state-of-the-art methods have primarily relied on either supervised learning with ground-truth labels or surface rendering based on the Lambertian assumption. In this letter, we propose a novel, self-supervised framework based on volume rendering for reconstructing bathymetry using forward-looking sonar (FLS) data collected during standard surveys. We represent the seafloor as a neural heightmap encapsulated with a parametric multi-resolution hash encoding scheme and model the sonar measurements with a differentiable renderer using sonar volumetric rendering with hierarchical sampling techniques. Additionally, we model the horizontal and vertical beam patterns and estimate them jointly with the bathymetry. We evaluate the proposed method quantitatively on simulation and field data collected by remotely operated vehicles (ROVs) during low-altitude surveys. Results show that the proposed method outperforms the current state-of-the-art approaches that use imaging sonars for seabed mapping. We also demonstrate that the proposed approach can potentially be used to increase the resolution of a low-resolution prior map with FLS data from low-altitude surveys.
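The central object here is a neural heightmap: a network that maps a horizontal position to a seabed height. The sketch below substitutes simple Fourier features for the paper's multi-resolution hash encoding, so it shows the interface rather than the actual encoding scheme; all layer sizes are arbitrary choices.

```python
# Simplified neural heightmap in PyTorch: (x, y) -> height. The paper uses a
# parametric multi-resolution hash encoding; plain Fourier features stand in.
import torch
import torch.nn as nn

class NeuralHeightmap(nn.Module):
    def __init__(self, n_freqs=8, hidden=128):
        super().__init__()
        self.freqs = 2.0 ** torch.arange(n_freqs)  # frequency bands
        self.mlp = nn.Sequential(
            nn.Linear(4 * n_freqs, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar seabed height
        )

    def forward(self, xy):  # xy: (N, 2) horizontal positions
        ang = xy[:, None, :] * self.freqs[None, :, None]             # (N, F, 2)
        feat = torch.cat([ang.sin(), ang.cos()], dim=-1).flatten(1)  # (N, 4F)
        return self.mlp(feat)  # (N, 1) heights

heightmap = NeuralHeightmap()
z = heightmap(torch.rand(1024, 2))  # query heights at sampled positions
```

In the full method, a heightmap like this is queried inside a differentiable sonar renderer, and the rendering loss drives both the heightmap and the jointly estimated beam patterns.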

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Sonar, Bathymetry, Image reconstruction, Three-dimensional displays, Rendering (computer graphics), Surveys, Encoding, Bathymetric reconstruction, deep learning methods, deep learning for visual perception, mapping, marine robotics
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-352714 (URN); 10.1109/LRA.2024.3440843 (DOI); 001294338600006 (ISI); 2-s2.0-85200810845 (Scopus ID)
Note

QC 20240905

Available from: 2024-09-05. Created: 2024-09-05. Last updated: 2025-02-09. Bibliographically approved.
Ling, L., Zhang, J., Bore, N., Folkesson, J. & Wåhlin, A. (2024). Benchmarking classical and learning-based multibeam point cloud registration. In: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024. Paper presented at 2024 IEEE International Conference on Robotics and Automation, ICRA 2024, May 13-17, 2024, Yokohama, Japan (pp. 6118-6125). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024. Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 6118-6125. Conference paper, Published paper (Refereed).
Abstract [en]

Deep learning has shown promising results for multiple 3D point cloud registration datasets. However, in the underwater domain, most registration of multibeam echo-sounder (MBES) point cloud data is still performed using classical methods in the iterative closest point (ICP) family. In this work, we curate and release the DotsonEast Dataset, a semi-synthetic MBES registration dataset constructed from data collected by an autonomous underwater vehicle in West Antarctica. Using this dataset, we systematically benchmark the performance of 2 classical and 4 learning-based methods. The experimental results show that the learning-based methods work well for coarse alignment and are better at recovering rough transforms consistently at high overlap (20-50%). In comparison, GICP (a variant of ICP) performs well for fine alignment and is better across all metrics at extremely low overlap (10%). To the best of our knowledge, this is the first work to benchmark both learning-based and classical registration methods on an AUV-based MBES dataset. To facilitate future research, both the code and data are made available online.
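As a concrete reference point for the classical baseline, the snippet below runs GICP on two point clouds with Open3D. Filenames, the downsampling voxel size, and the correspondence distance are illustrative assumptions, not the benchmark's settings.

```python
# Illustrative GICP fine alignment of two MBES submaps with Open3D.
import numpy as np
import open3d as o3d

src = o3d.io.read_point_cloud("mbes_submap_a.ply")  # hypothetical files
tgt = o3d.io.read_point_cloud("mbes_submap_b.ply")
src = src.voxel_down_sample(voxel_size=1.0)
tgt = tgt.voxel_down_sample(voxel_size=1.0)

result = o3d.pipelines.registration.registration_generalized_icp(
    src, tgt,
    5.0,        # max correspondence distance (meters, assumed)
    np.eye(4))  # initial guess, e.g. from a coarse (learned) alignment
print(result.transformation)
print("fitness:", result.fitness)
```

The paper's finding suggests a natural pipeline: use a learning-based method for the coarse initial transform, then refine with GICP as above.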

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-353554 (URN); 10.1109/ICRA57147.2024.10610118 (DOI); 2-s2.0-85202430012 (Scopus ID)
Conference
2024 IEEE International Conference on Robotics and Automation, ICRA 2024, May 13-17, 2024, Yokohama, Japan
Note

Part of ISBN: 9798350384574

QC 20240926

Available from: 2024-09-19. Created: 2024-09-19. Last updated: 2024-09-26. Bibliographically approved.
Terán Espinoza, A., Folkesson, J., Sigray, P. & Kuttenkeuler, J. (2024). Boundary Factors for Seamless State Estimation between Autonomous Underwater Docking Phases. In: 2024 IEEE International Conference on Robotics and Automation (ICRA). Paper presented at 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, May 13-17 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 IEEE International Conference on Robotics and Automation (ICRA). Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed).
Abstract [en]

Autonomous underwater docking is of the utmost importance for expanding the capabilities of Autonomous Underwater Vehicles (AUVs). Because research has historically focused on docking with static targets, docking with dynamically active targets has been left relatively unexplored. We address the state estimation problem that arises when trying to rendezvous a chaser AUV with a dynamic target by modeling the scenario as a factor graph optimization-based Simultaneous Localization and Mapping problem. We present a set of boundary factors that aid the inference process by seamlessly transitioning the target's state between the different observability stages intrinsic to any dynamic docking scenario. We benchmark the performance of our approach using the Stonefish simulated environment.
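The paper's boundary factors are custom factors; as a crude stand-in, one can picture a tight between-factor that ties the target's last state in one observability phase to its first state in the next, so the estimate hands over smoothly. The GTSAM snippet below is only that stand-in, with invented keys and noise values.

```python
# Crude stand-in for a phase-boundary constraint (not the paper's factor):
# tie the target's state across a docking-phase transition.
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
tight = gtsam.noiseModel.Diagonal.Sigmas(np.array([1e-2, 1e-2, 1e-3]))

last_in_phase_a, first_in_phase_b = 41, 42  # hypothetical variable keys
graph.add(gtsam.BetweenFactorPose2(
    last_in_phase_a, first_in_phase_b, gtsam.Pose2(0, 0, 0), tight))
```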

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-365112 (URN); 10.1109/ICRA57147.2024.10611552 (DOI); 001369728001030 (ISI); 2-s2.0-85202452730 (Scopus ID)
Conference
2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, May 13-17 2024
Funder
Swedish Foundation for Strategic Research
Note

This work was supported by the Stiftelsen för Strategisk Forskning (SSF) through the Swedish Maritime Robotics Centre (SMaRC) (IRC15-0046).

QC 20250701

Available from: 2025-06-18. Created: 2025-06-18. Last updated: 2025-07-01. Bibliographically approved.
Yang, Y., Zhang, Q., Ikemura, K., Batool, N. & Folkesson, J. (2024). Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models. In: 35th IEEE Intelligent Vehicles Symposium, IV 2024. Paper presented at 35th IEEE Intelligent Vehicles Symposium, IV 2024, Jeju Island, Korea, Jun 2 2024 - Jun 5 2024 (pp. 2405-2412). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 35th IEEE Intelligent Vehicles Symposium, IV 2024. Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 2405-2412. Conference paper, Published paper (Refereed).
Abstract [en]

Addressing hard cases in autonomous driving, such as anomalous road users, extreme weather conditions, and complex traffic interactions, presents significant challenges. To ensure safety, it is crucial for autonomous driving systems to detect and manage these scenarios effectively. However, the rarity and high-risk nature of these cases demand extensive, diverse datasets for training robust models. Vision-Language Foundation Models (VLMs), trained on extensive datasets, have shown remarkable zero-shot capabilities. This work explores the potential of VLMs in detecting hard cases in autonomous driving. We demonstrate the capability of VLMs such as GPT-4V to detect hard cases in traffic participant motion prediction on both the agent and scenario levels. We introduce a feasible pipeline in which VLMs, fed sequential image frames with designed prompts, effectively identify challenging agents or scenarios, which are verified by existing prediction models. Moreover, by taking advantage of this detection of hard cases by VLMs, we further improve the training efficiency of the existing motion prediction pipeline by performing data selection on the training samples suggested by GPT. We show the effectiveness and feasibility of our pipeline incorporating VLMs with state-of-the-art methods on the NuScenes dataset. The code is accessible at https://github.com/KTH-RPL/Detect-VLM.
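The kind of VLM query the pipeline relies on looks roughly like the sketch below: a few sequential frames plus a designed prompt, sent to a vision-capable chat model. The model name, prompt wording, and filenames are illustrative assumptions; the authors' actual pipeline is in the linked repository.

```python
# Illustrative hard-case query to a vision-language model (OpenAI SDK).
import base64
from openai import OpenAI

client = OpenAI()

def b64(path):
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

frames = ["frame_000.png", "frame_001.png", "frame_002.png"]  # hypothetical
content = [{"type": "text",
            "text": "These are consecutive driving-scene frames. Is any road "
                    "user likely to be hard for a motion-prediction model? "
                    "Answer yes/no and name the agent."}]
content += [{"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64(p)}"}}
            for p in frames]

resp = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model
    messages=[{"role": "user", "content": content}])
print(resp.choices[0].message.content)
```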

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-351756 (URN); 10.1109/IV55156.2024.10588694 (DOI); 001275100902068 (ISI); 2-s2.0-85199784263 (Scopus ID)
Conference
35th IEEE Intelligent Vehicles Symposium, IV 2024, Jeju Island, Korea, Jun 2 2024 - Jun 5 2024
Note

Part of ISBN: 9798350348811

QC 20240815

Available from: 2024-08-13. Created: 2024-08-13. Last updated: 2025-02-07. Bibliographically approved.
Yang, Y., Zhang, Q., Li, C., Simões Marta, D., Batool, N. & Folkesson, J. (2024). Human-Centric Autonomous Systems With LLMs for User Command Reasoning. In: 2024 IEEE Winter Conference on Applications of Computer Vision Workshops, WACVW 2024. Paper presented at IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), JAN 04-08, 2024, Waikoloa, HI (pp. 988-994). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 IEEE Winter Conference on Applications of Computer Vision Workshops, WACVW 2024. Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 988-994. Conference paper, Published paper (Refereed).
Abstract [en]

Autonomous driving has made remarkable advances in recent years, evolving into a tangible reality. However, human-centric, large-scale adoption hinges on meeting a variety of multifaceted requirements. To ensure that the autonomous system meets the user's intent, it is essential to accurately discern and interpret user commands, especially in complex or emergency situations. To this end, we propose to leverage the reasoning capabilities of Large Language Models (LLMs) to infer system requirements from in-cabin users' commands. Through a series of experiments that include different LLM models and prompt designs, we explore the few-shot multivariate binary classification accuracy of system requirements from natural language textual commands. We confirm the general ability of LLMs to understand and reason about prompts but underline that their effectiveness is conditioned on the quality of both the LLM model and the design of appropriate sequential prompts. Code and models are publicly available at https://github.com/KTH-RPL/DriveCmd_LLM.
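"Few-shot multivariate binary classification" here means: one prompt, several labeled example commands, and one binary answer per requirement. The sketch below invents a small requirement set and examples to show the shape of such a prompt; the authors' actual prompts and models are in the linked repository.

```python
# Illustrative few-shot prompt for multivariate binary command classification.
from openai import OpenAI

client = OpenAI()

REQUIREMENTS = ["pull_over", "reroute", "adjust_speed", "call_emergency"]

PROMPT = """For the user command, output 1 or 0 for each requirement, as
comma-separated values in this order: pull_over, reroute, adjust_speed,
call_emergency.

Command: "I feel sick, stop as soon as it is safe." -> 1,0,1,0
Command: "Avoid the highway, take the scenic route." -> 0,1,0,0
Command: "{cmd}" ->"""

resp = client.chat.completions.create(
    model="gpt-4o",  # assumed model
    messages=[{"role": "user",
               "content": PROMPT.format(cmd="Someone is hurt, get help now!")}])
answer = resp.choices[0].message.content.strip()
print(dict(zip(REQUIREMENTS, answer.split(","))))
```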

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
IEEE Winter Conference on Applications of Computer Vision Workshops, ISSN 2572-4398
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-351635 (URN); 10.1109/WACVW60836.2024.00108 (DOI); 001223022200040 (ISI); 2-s2.0-85188691382 (Scopus ID)
Conference
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), JAN 04-08, 2024, Waikoloa, HI
Note

QC 20240813

Part of ISBN: 979-8-3503-7028-7, 979-8-3503-7071-3

Available from: 2024-08-13. Created: 2024-08-13. Last updated: 2024-10-11. Bibliographically approved.
Ling, L., Xie, Y., Bore, N. & Folkesson, J. (2024). Score-Based Multibeam Point Cloud Denoising. In: 2024 IEEE/OES Autonomous Underwater Vehicles Symposium (AUV). Paper presented at 2024 IEEE/OES Autonomous Underwater Vehicles Symposium (AUV), Boston, MA, USA, September 18-20, 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 IEEE/OES Autonomous Underwater Vehicles Symposium (AUV). Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed).
Abstract [en]

The multibeam echo-sounder (MBES) is the de facto sensor for bathymetry mapping. In recent years, cheaper MBES sensors and global mapping initiatives have led to exponential growth of available data. However, raw MBES data contain 1-25% noise, which requires semi-automatic filtering using tools such as the Combined Uncertainty and Bathymetric Estimator (CUBE). In this work, we draw inspiration from the 3D point cloud community and adapt a score-based point cloud denoising network for MBES outlier detection and denoising. We train and evaluate this network on real MBES survey data. The proposed method is found to outperform classical methods and can be readily integrated into the existing standard MBES workflow. To facilitate future research, the code and pretrained model are available online.
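The core move in score-based denoising is simple to state: a network predicts the score (the gradient of the log-density of clean points), and noisy points are nudged along that field for a few steps. The sketch below shows only that update rule; score_net is a placeholder for the trained network, and the step schedule is an assumption.

```python
# Score-based denoising update (illustrative; score_net is a placeholder).
import torch

def denoise(points, score_net, n_steps=30, step_size=0.05):
    x = points.clone()               # (N, 3) noisy MBES soundings
    for _ in range(n_steps):
        with torch.no_grad():
            score = score_net(x)     # (N, 3) estimated gradient of log-density
        x = x + step_size * score    # move toward the high-density surface
    return x
```

Thresholding the total displacement of each point is one plausible way to turn the same machinery into an outlier detector.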

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Point cloud compression, Surveys, Adaptation models, Uncertainty, Three-dimensional displays, Noise reduction, Noise, Bathymetry, Anomaly detection, Standards
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-365100 (URN); 10.1109/AUV61864.2024.11030792 (DOI)
Conference
2024 IEEE/OES Autonomous Underwater Vehicles Symposium (AUV), Boston, MA, USA, September 18-20, 2024
Note

Part of ISBN: 9798331542238

QC 20250701

Available from: 2025-06-18. Created: 2025-06-18. Last updated: 2025-07-01. Bibliographically approved.
Xie, Y., Bore, N. & Folkesson, J. (2023). Bathymetric Reconstruction From Sidescan Sonar With Deep Neural Networks. IEEE Journal of Oceanic Engineering, 48(2), 372-383
2023 (English). In: IEEE Journal of Oceanic Engineering, ISSN 0364-9059, E-ISSN 1558-1691, Vol. 48, no. 2, p. 372-383. Article in journal (Refereed). Published.
Abstract [en]

In this article, we propose a novel data-driven approach for high-resolution bathymetric reconstruction from sidescan. Sidescan sonar intensities as a function of range do contain some information about the slope of the seabed; however, that information must be inferred. In addition, the navigation system provides the estimated trajectory, and normally the altitude along this trajectory is also available. From these, we obtain a very coarse seabed bathymetry as an input. This is then combined with the indirect but high-resolution seabed slope information from the sidescan to estimate the full bathymetry. This sparse depth could be acquired by a single-beam echo sounder, a Doppler velocity log, or other bottom-tracking sensors, or by a bottom-tracking algorithm applied to the sidescan itself. In our work, a fully convolutional network is used to estimate the depth contour and its aleatoric uncertainty from the sidescan images and sparse depth in an end-to-end fashion. The estimated depth is then used together with the range to calculate each point's three-dimensional location on the seafloor. A high-quality bathymetric map can be reconstructed after fusing the depth predictions and the corresponding confidence measures from the neural networks. We show the improvement of the bathymetric map gained by using sparse depths with sidescan over estimates with sidescan alone. We also show the benefit of confidence weighting when fusing multiple bathymetric estimates into a single map.
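One standard way to realize the confidence-weighted fusion mentioned at the end is inverse-variance weighting: estimates are combined so that low-uncertainty predictions dominate the fused depth. A minimal sketch, with the gridding details abstracted away; the paper's exact weighting may differ.

```python
# Inverse-variance fusion of several depth estimates for one map cell.
import numpy as np

def fuse_cell(depths, variances):
    """Fuse depth predictions using their aleatoric variances as confidence."""
    depths = np.asarray(depths, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)   # confidence weights
    fused = np.sum(w * depths) / np.sum(w)         # weighted mean depth
    fused_var = 1.0 / np.sum(w)                    # variance of fused estimate
    return fused, fused_var

print(fuse_cell([42.1, 41.8, 42.5], [0.04, 0.25, 0.09]))
```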

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
Bathymetric mapping, data-driven, neural network, sidescan sonar (SSS)
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-330071 (URN); 10.1109/JOE.2022.3220330 (DOI); 000906218600001 (ISI); 2-s2.0-85146230440 (Scopus ID)
Note

QC 20230626

Available from: 2023-06-26. Created: 2023-06-26. Last updated: 2025-02-09. Bibliographically approved.