Robust Depth Enhancement via Polarization Prompt Fusion Tuning
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
CUHK, Hong Kong, People's Republic of China.
Princeton University, USA.
HKISI CAS, CAIR, Hong Kong, People's Republic of China; CASIA, Beijing, People's Republic of China.
2024 (English). In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), 2024, pp. 20710-20720. Conference paper, published paper (refereed).
Abstract [en]

Existing depth sensors are imperfect and may provide inaccurate depth values in challenging scenarios, such as in the presence of transparent or reflective objects. In this work, we present a general framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors. Previous polarization-based depth enhancement methods focus on utilizing pure physics-based formulas for a single sensor. In contrast, our method first adopts a learning-based strategy in which a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors. To further improve performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets, as the polarization dataset is too small to train a strong model from scratch. We conducted extensive experiments on a public dataset, and the results demonstrate that the proposed method performs favorably compared to existing depth enhancement baselines. Code and demos are available at https://lastbasket.github.io/PPFT/.

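To make the fusion-tuning idea concrete, the sketch below shows one plausible way to implement it in PyTorch: a small trainable branch encodes the polarization input and adds it as a zero-initialized residual "prompt" at each stage of a frozen, RGB-pretrained depth-completion backbone, so only a few parameters need to be fit on the limited polarization data. This is not the authors' released code (see the project page above); the module names, channel counts, the 9-channel polarization input, and the toy two-stage backbone are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class PromptBlock(nn.Module):
        """Encodes polarization cues and injects them into a frozen backbone
        feature map as a zero-initialized residual "prompt"."""

        def __init__(self, pol_ch: int, feat_ch: int):
            super().__init__()
            self.encode = nn.Sequential(
                nn.Conv2d(pol_ch, feat_ch, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(feat_ch, feat_ch, 3, padding=1),
            )
            # Zero scale: at step 0 the wrapped model behaves exactly like the
            # pretrained RGB network; the prompt's influence is learned gradually.
            self.scale = nn.Parameter(torch.zeros(1))

        def forward(self, feat, pol):
            pol = F.interpolate(pol, size=feat.shape[-2:], mode="bilinear",
                                align_corners=False)
            return feat + self.scale * self.encode(pol)


    class ToyDepthBackbone(nn.Module):
        """Stand-in for an RGB-based depth-completion network pretrained on a
        large-scale dataset; a real backbone would be far deeper."""

        def __init__(self, chans=(32, 64)):
            super().__init__()
            self.stem = nn.Conv2d(4, chans[0], 3, padding=1)  # RGB + sensor depth
            self.stage1 = nn.Conv2d(chans[0], chans[1], 3, stride=2, padding=1)
            self.head = nn.Conv2d(chans[1], 1, 3, padding=1)
            self.chans = chans


    class PromptFusionTuning(nn.Module):
        """Freezes the backbone and trains only the polarization prompts."""

        def __init__(self, backbone, pol_ch=9):
            super().__init__()
            self.backbone = backbone
            for p in self.backbone.parameters():
                p.requires_grad = False           # reuse RGB pretraining as-is
            self.prompt1 = PromptBlock(pol_ch, backbone.chans[0])
            self.prompt2 = PromptBlock(pol_ch, backbone.chans[1])

        def forward(self, rgb, sensor_depth, pol):
            x = torch.cat([rgb, sensor_depth], dim=1)
            f = torch.relu(self.backbone.stem(x))
            f = self.prompt1(f, pol)              # fuse polarization at stage 1
            f = torch.relu(self.backbone.stage1(f))
            f = self.prompt2(f, pol)              # fuse polarization at stage 2
            depth = self.backbone.head(f)
            return F.interpolate(depth, scale_factor=2.0, mode="bilinear",
                                 align_corners=False)


    if __name__ == "__main__":
        model = PromptFusionTuning(ToyDepthBackbone())
        rgb = torch.rand(1, 3, 64, 64)
        sensor = torch.rand(1, 1, 64, 64)     # imperfect depth from any sensor
        pol = torch.rand(1, 9, 64, 64)        # stacked polarization images (assumed)
        print(model(rgb, sensor, pol).shape)  # torch.Size([1, 1, 64, 64])

Because the prompt scales start at zero, training begins from exactly the pretrained model's behavior, which is the usual motivation for prompt- or adapter-style tuning on small datasets.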
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024, pp. 20710-20720.
Series
IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:kth:diva-359801
DOI: 10.1109/CVPR52733.2024.01957
ISI: 001342515504007
OAI: oai:DiVA.org:kth-359801
DiVA id: diva2:1937081
Conference
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 16-22, 2024, Seattle, WA, USA
Note

Part of ISBN 979-8-3503-5300-6

QC 20250212

Available from: 2025-02-12. Created: 2025-02-12. Last updated: 2025-02-12. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records

Ikemura, Kei
