KTH Publications (kth.se)
Publications (3 of 3)
Bengtson, J., Nilsson, D., Lin, C. T., Büsching, M. & Kahl, F. (2025). Adjustable Visual Appearance for Generalizable Novel View Synthesis. In: Pattern Recognition and Artificial Intelligence - 4th International Conference, ICPRAI 2024, Proceedings. Paper presented at 4th International Conference on Pattern Recognition and Artificial Intelligence, ICPRAI 2024, Jeju Island, Korea, Jul 3 2024 - Jul 6 2024 (pp. 157-171). Springer Nature
2025 (English). In: Pattern Recognition and Artificial Intelligence - 4th International Conference, ICPRAI 2024, Proceedings, Springer Nature, 2025, pp. 157-171. Conference paper, published paper (refereed)
Abstract [en]

We present a generalizable novel view synthesis method which enables modifying the visual appearance of an observed scene so that rendered views match a target weather or lighting condition, without any scene-specific training or access to reference views at the target condition. Our method is based on a pretrained generalizable transformer architecture and is fine-tuned on synthetically generated scenes under different appearance conditions. This allows for rendering novel views in a consistent manner for 3D scenes that were not included in the training set, along with the ability to (i) modify their appearance to match the target condition and (ii) smoothly interpolate between different conditions. Experiments on real and synthetic scenes, including qualitative and quantitative comparisons, show that our method is able to generate 3D-consistent renderings while making realistic appearance changes. Please refer to our project page for video results: https://ava-nvs.github.io.
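
The abstract does not say how appearance conditions are encoded or blended. As a minimal sketch only, assuming each condition is represented by a fixed-length vector (the names `z_day`, `z_night` and `interpolate_condition` are hypothetical and not taken from the paper), smooth interpolation between two conditions could look like this:

```python
import numpy as np

def interpolate_condition(z_a, z_b, t):
    """Linearly blend two appearance codes; t=0 returns z_a, t=1 returns z_b."""
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return (1.0 - t) * z_a + t * z_b

# Hypothetical 4-dimensional codes for two appearance conditions.
z_day = np.array([1.0, 0.0, 0.0, 0.0])
z_night = np.array([0.0, 1.0, 0.0, 0.0])

# Five evenly spaced blends from the first condition to the second.
codes = [interpolate_condition(z_day, z_night, t) for t in np.linspace(0.0, 1.0, 5)]
```

In such a setup each blended code would be passed to the renderer in place of a single condition; whether the published method interpolates linearly is not stated in the abstract.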

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
3D Style Transfer, Generalizable Novel View Synthesis, NeRFs
National subject category
Computer graphics and computer vision; Signal processing
Identifiers
urn:nbn:se:kth:diva-361150 (URN); 10.1007/978-981-97-8702-9_11 (DOI); 001584478300011 (); 2-s2.0-85219213349 (Scopus ID)
Conference
4th International Conference on Pattern Recognition and Artificial Intelligence, ICPRAI 2024, Jeju Island, Korea, Jul 3 2024 - Jul 6 2024
Note

Part of ISBN 9789819787012

QC 20250313

Available from: 2025-03-12. Created: 2025-03-12. Last updated: 2026-05-29. Bibliographically approved
Longhini, A., Büsching, M., Duisterhof, B. P., Lundell, J., Ichnowski, J., Björkman, M. & Kragic, D. (2024). Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision. In: Proceedings of the 8th Conference on Robot Learning, CoRL 2024. Paper presented at 8th Annual Conference on Robot Learning, November 6-9, 2024, Munich, Germany (pp. 2845-2865). ML Research Press
2024 (English). In: Proceedings of the 8th Conference on Robot Learning, CoRL 2024, ML Research Press, 2024, pp. 2845-2865. Conference paper, published paper (refereed)
Abstract [en]

We introduce Cloth-Splatting, a method for estimating 3D states of cloth from RGB images through a prediction-update framework. Cloth-Splatting leverages an action-conditioned dynamics model for predicting future states and uses 3D Gaussian Splatting to update the predicted states. Our key insight is that coupling a 3D mesh-based representation with Gaussian Splatting allows us to define a differentiable map between the cloth's state space and the image space. This enables the use of gradient-based optimization techniques to refine inaccurate state estimates using only RGB supervision. Our experiments demonstrate that Cloth-Splatting not only improves state estimation accuracy over current baselines but also reduces convergence time by ∼85 %.
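
To illustrate the prediction-update idea described above, here is a toy sketch under strong simplifying assumptions. The learned action-conditioned dynamics model is replaced by a constant displacement, and the mesh-plus-Gaussian-Splatting renderer is replaced by a linear projection of vertices to 2D. All function names are hypothetical, and this is not the authors' implementation; it only shows how a differentiable map from state to image space lets gradient descent correct a poor prediction.

```python
import numpy as np

def predict(state, action):
    # Toy dynamics (assumption): every vertex moves by the commanded displacement.
    return state + action

def render(state, P):
    # Toy differentiable map (assumption): project (N, 3) vertices to (N, 2).
    return state @ P.T

def update(state, observed, P, lr=0.1, steps=200):
    # Gradient descent on the squared image-space error sum((render - observed)**2).
    for _ in range(steps):
        residual = render(state, P) - observed   # shape (N, 2)
        grad = 2.0 * residual @ P                # gradient w.r.t. state, shape (N, 3)
        state = state - lr * grad
    return state

# Orthographic camera looking down the z axis.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
true_state = np.array([[0.0, 0.0, 1.0],
                       [1.0, 0.5, 1.0]])
observed = render(true_state, P)

# Prediction step gives an inaccurate estimate; the update step refines it.
predicted = predict(np.zeros((2, 3)), np.array([0.2, 0.0, 1.0]))
refined = update(predicted, observed, P)
```

With a single orthographic view the depth coordinate receives no gradient, so it stays at its predicted value; the refinement only corrects what the image observes.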

Place, publisher, year, edition, pages
ML Research Press, 2024
Keywords
3D State Estimation, Gaussian Splatting, Vision-based Tracking, Deformable Objects
National subject category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-357192 (URN); 2-s2.0-86000735293 (Scopus ID)
Conference
8th Annual Conference on Robot Learning, November 6-9, 2024, Munich, Germany
Note

QC 20250328

Available from: 2024-12-04. Created: 2024-12-04. Last updated: 2025-03-28. Bibliographically approved
Büsching, M., Bengtson, J., Nilsson, D. & Björkman, M. (2024). FlowIBR: Leveraging Pre-Training for Efficient Neural Image-Based Rendering of Dynamic Scenes. In: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024. Paper presented at 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024, Seattle, United States of America, Jun 16 2024 - Jun 22 2024 (pp. 8016-8026). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, pp. 8016-8026. Conference paper, published paper (refereed)
Abstract [en]

We introduce FlowIBR, a novel approach for efficient monocular novel view synthesis of dynamic scenes. Existing techniques already show impressive rendering quality but tend to focus on optimization within a single scene without leveraging prior knowledge, resulting in long optimization times per scene. FlowIBR circumvents this limitation by integrating a neural image-based rendering method, pretrained on a large corpus of widely available static scenes, with a per-scene optimized scene flow field. Utilizing this flow field, we bend the camera rays to counteract the scene dynamics, thereby presenting the dynamic scene as if it were static to the rendering network. The proposed method reduces per-scene optimization time by an order of magnitude, achieving comparable rendering quality to existing methods - all on a single consumer-grade GPU.
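
The ray-bending step can be pictured with a small sketch. It assumes that points sampled along a camera ray are displaced by a flow field evaluated at those points, so that a renderer trained on static scenes sees them where they would be at another time. The constant-velocity `toy_flow` below is a hypothetical stand-in for the per-scene optimized scene flow field, and none of the names come from the paper.

```python
import numpy as np

def sample_ray(origin, direction, depths):
    # Points origin + depth * unit_direction, one row per depth sample.
    direction = direction / np.linalg.norm(direction)
    return origin + depths[:, None] * direction

def toy_flow(points, dt):
    # Hypothetical flow field: the whole scene drifts along x at constant velocity.
    velocity = np.array([0.1, 0.0, 0.0])
    return np.broadcast_to(velocity * dt, points.shape)

def bend_ray(points, dt, flow=toy_flow):
    # Displace each sample by the flow over the time offset dt.
    return points + flow(points, dt)

origin = np.zeros(3)
direction = np.array([0.0, 0.0, 2.0])
depths = np.array([1.0, 2.0, 3.0])

points = sample_ray(origin, direction, depths)   # straight ray along +z
bent = bend_ray(points, dt=2.0)                  # each sample shifted by 0.2 in x
```

A learned flow field would generally vary with position, so the bent samples would no longer lie on a straight line.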

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
3D from multi-view and sensors, Dynamic scenes, Neural rendering
National subject category
Computer science; Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-367269 (URN); 10.1109/CVPRW63382.2024.00800 (DOI); 001327781708021 (); 2-s2.0-85206485314 (Scopus ID)
Conference
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024, Seattle, United States of America, Jun 16 2024 - Jun 22 2024
Note

Part of ISBN 9798350365474

QC 20250717

Available from: 2025-07-17. Created: 2025-07-17. Last updated: 2025-07-17. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-9296-9166
