KTH Publications
Publications (3 of 3)
Bengtson, J., Nilsson, D., Lin, C. T., Büsching, M. & Kahl, F. (2025). Adjustable Visual Appearance for Generalizable Novel View Synthesis. In: Pattern Recognition and Artificial Intelligence - 4th International Conference, ICPRAI 2024, Proceedings. Paper presented at 4th International Conference on Pattern Recognition and Artificial Intelligence, ICPRAI 2024, Jeju Island, Korea, Jul 3 2024 - Jul 6 2024 (pp. 157-171). Springer Nature
2025 (English). In: Pattern Recognition and Artificial Intelligence - 4th International Conference, ICPRAI 2024, Proceedings, Springer Nature, 2025, p. 157-171. Conference paper, Published paper (Refereed)
Abstract [en]

We present a generalizable novel view synthesis method that enables modifying the visual appearance of an observed scene so that rendered views match a target weather or lighting condition, without any scene-specific training or access to reference views at the target condition. Our method is based on a pretrained generalizable transformer architecture and is fine-tuned on synthetically generated scenes under different appearance conditions. This allows for rendering novel views in a consistent manner for 3D scenes that were not included in the training set, along with the ability to (i) modify their appearance to match the target condition and (ii) smoothly interpolate between different conditions. Experiments on real and synthetic scenes, including qualitative and quantitative comparisons, show that our method generates 3D-consistent renderings while making realistic appearance changes. Please refer to our project page for video results: https://ava-nvs.github.io.

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
3D Style Transfer, Generalizable Novel View Synthesis, NeRFs
National Category
Computer graphics and computer vision; Signal Processing
Identifiers
urn:nbn:se:kth:diva-361150 (URN)
10.1007/978-981-97-8702-9_11 (DOI)
2-s2.0-85219213349 (Scopus ID)
Conference
4th International Conference on Pattern Recognition and Artificial Intelligence, ICPRAI 2024, Jeju Island, Korea, Jul 3 2024 - Jul 6 2024
Note

Part of ISBN 9789819787012

QC 20250313

Available from: 2025-03-12. Created: 2025-03-12. Last updated: 2025-03-13. Bibliographically approved.
Longhini, A., Büsching, M., Duisterhof, B. P., Lundell, J., Ichnowski, J., Björkman, M. & Kragic, D. (2024). Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision. In: Proceedings of the 8th Conference on Robot Learning, CoRL 2024. Paper presented at 8th Annual Conference on Robot Learning, November 6-9, 2024, Munich, Germany (pp. 2845-2865). ML Research Press
2024 (English). In: Proceedings of the 8th Conference on Robot Learning, CoRL 2024, ML Research Press, 2024, p. 2845-2865. Conference paper, Published paper (Refereed)
Abstract [en]

We introduce Cloth-Splatting, a method for estimating 3D states of cloth from RGB images through a prediction-update framework. Cloth-Splatting leverages an action-conditioned dynamics model for predicting future states and uses 3D Gaussian Splatting to update the predicted states. Our key insight is that coupling a 3D mesh-based representation with Gaussian Splatting allows us to define a differentiable map between the cloth's state space and the image space. This enables the use of gradient-based optimization techniques to refine inaccurate state estimates using only RGB supervision. Our experiments demonstrate that Cloth-Splatting not only improves state estimation accuracy over current baselines but also reduces convergence time by ∼85 %.
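The prediction-update idea in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the state is a short list of scalars, `predict` is a hypothetical dynamics model that simply adds the action, and `render` is a hypothetical linear stand-in for the differentiable mesh-to-image map that the paper builds with Gaussian Splatting.

```python
from typing import List


def predict(state: List[float], action: List[float]) -> List[float]:
    # Hypothetical dynamics model: each state entry moves by the commanded offset.
    return [s + a for s, a in zip(state, action)]


def render(state: List[float]) -> List[float]:
    # Hypothetical differentiable "renderer": a fixed linear map (pixel = 2 * state).
    return [2.0 * s for s in state]


def update(state: List[float], observed: List[float],
           lr: float = 0.1, steps: int = 100) -> List[float]:
    """Refine a predicted state by gradient descent on the squared image error."""
    state = list(state)
    for _ in range(steps):
        residual = [r - o for r, o in zip(render(state), observed)]
        # Gradient of (2s - o)^2 with respect to s is 2 * (2s - o) * 2.
        grad = [4.0 * r for r in residual]
        state = [s - lr * g for s, g in zip(state, grad)]
    return state


true_state = [1.0, -0.5]
observed = render(true_state)                   # the "image" supervision
predicted = predict([0.0, 0.0], [0.8, -0.3])    # inaccurate prediction
refined = update(predicted, observed)
print(predicted, refined)
```

In this toy setting the refinement only ever sees the rendered values, never the true state, which mirrors the abstract's point that image-space supervision suffices once the map from state to image is differentiable.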

Place, publisher, year, edition, pages
ML Research Press, 2024
Keywords
3D State Estimation, Gaussian Splatting, Vision-based Tracking, Deformable Objects
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-357192 (URN)
2-s2.0-86000735293 (Scopus ID)
Conference
8th Annual Conference on Robot Learning, November 6-9, 2024, Munich, Germany
Note

QC 20250328

Available from: 2024-12-04. Created: 2024-12-04. Last updated: 2025-03-28. Bibliographically approved.
Büsching, M., Bengtson, J., Nilsson, D. & Björkman, M. (2024). FlowIBR: Leveraging Pre-Training for Efficient Neural Image-Based Rendering of Dynamic Scenes. In: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024. Paper presented at 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024, Seattle, United States of America, Jun 16 2024 - Jun 22 2024 (pp. 8016-8026). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 8016-8026. Conference paper, Published paper (Refereed)
Abstract [en]

We introduce FlowIBR, a novel approach for efficient monocular novel view synthesis of dynamic scenes. Existing techniques already show impressive rendering quality but tend to focus on optimization within a single scene without leveraging prior knowledge, resulting in long optimization times per scene. FlowIBR circumvents this limitation by integrating a neural image-based rendering method, pretrained on a large corpus of widely available static scenes, with a per-scene optimized scene flow field. Utilizing this flow field, we bend the camera rays to counteract the scene dynamics, thereby presenting the dynamic scene as if it were static to the rendering network. The proposed method reduces per-scene optimization time by an order of magnitude while achieving rendering quality comparable to existing methods, all on a single consumer-grade GPU.
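The ray-bending step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the FlowIBR code: the helper names `sample_ray`, `bend_ray`, and `constant_velocity_flow` are hypothetical, and a constant-velocity field stands in for the learned, per-scene optimized scene flow field.

```python
from typing import Callable, List, Tuple

Vec3 = Tuple[float, ...]
FlowFn = Callable[[Vec3, float, float], Vec3]


def sample_ray(origin: Vec3, direction: Vec3, depths: List[float]) -> List[Vec3]:
    """Return the points origin + t * direction for each depth t."""
    return [tuple(o + t * d for o, d in zip(origin, direction)) for t in depths]


def bend_ray(points: List[Vec3], flow: FlowFn,
             t_target: float, t_source: float) -> List[Vec3]:
    """Displace each ray sample by the scene flow from t_target to t_source."""
    bent = []
    for p in points:
        offset = flow(p, t_target, t_source)
        bent.append(tuple(pc + oc for pc, oc in zip(p, offset)))
    return bent


def constant_velocity_flow(velocity: Vec3) -> FlowFn:
    # Hypothetical stand-in for a learned flow field: uniform motion everywhere.
    def flow(point: Vec3, t_from: float, t_to: float) -> Vec3:
        dt = t_to - t_from
        return tuple(v * dt for v in velocity)
    return flow


# Two samples along a ray looking down +z, in a scene drifting along +x.
points = sample_ray((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), [1.0, 2.0])
flow = constant_velocity_flow((0.5, 0.0, 0.0))
bent = bend_ray(points, flow, t_target=0.0, t_source=2.0)
print(bent)  # [(1.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
```

The bent samples are where the same scene content sits at the source view's time, so a renderer trained only on static scenes could in principle look them up in that source image as if nothing had moved.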

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
3D from multi-view and sensors, Dynamic scenes, Neural rendering
National Category
Computer Sciences; Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-367269 (URN)
10.1109/CVPRW63382.2024.00800 (DOI)
001327781708021 ()
2-s2.0-85206485314 (Scopus ID)
Conference
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024, Seattle, United States of America, Jun 16 2024 - Jun 22 2024
Note

Part of ISBN 9798350365474

QC 20250717

Available from: 2025-07-17. Created: 2025-07-17. Last updated: 2025-07-17. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0001-9296-9166