ZoDi: Zero-Shot Domain Adaptation with Diffusion-Based Image Transfer
KTH. Univ Tokyo, Bunkyo, Japan.
Univ Tokyo, Bunkyo, Japan.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-4266-6746
2025 (English). In: Computer Vision - ECCV 2024 Workshops, Pt XVIII / [ed] Del Bue, A.; Canton, C.; Pont-Tuset, J.; Tommasi, T., Springer Nature, 2025, Vol. 15640, p. 151-167. Conference paper, Published paper (Refereed)
Abstract [en]

Deep learning models achieve high accuracy in segmentation tasks, among others, yet domain shift often degrades their performance, which can be critical in real-world scenarios where no target images are available. This paper proposes a zero-shot domain adaptation method based on diffusion models, called ZoDi, which is two-fold by design: zero-shot image transfer and model adaptation. First, we utilize an off-the-shelf diffusion model to synthesize target-like images by transferring the domain of source images to the target domain; here we specifically aim to maintain the layout and content by utilizing layout-to-image diffusion models with stochastic inversion. Second, we train the model using both the source images and the synthesized images with the original segmentation maps, while maximizing the feature similarity of images from the two domains to learn domain-robust representations. Through experiments we show the benefits of ZoDi over state-of-the-art methods in the task of image segmentation. It is also more widely applicable than existing CLIP-based methods because it assumes no specific backbone or model, and it makes it possible to estimate the model's performance without target images by inspecting the generated images. Our implementation will be publicly available at https://github.com/azuma164/ZoDi.
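The model-adaptation stage described above can be sketched as a combined objective: a segmentation loss applied to both the source image and its diffusion-transferred counterpart (both reusing the original segmentation map), plus a term rewarding feature similarity between the two domains. The NumPy sketch below is a minimal illustration under stated assumptions, not the authors' implementation; the function names, the cosine-similarity choice, and the weighting `lam` are all assumptions for the sake of the example.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean pixel-wise cross-entropy; probs: (H, W, C), labels: (H, W) ints."""
    h, w = labels.shape
    # Pick the predicted probability of the ground-truth class per pixel.
    p = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return -float(np.mean(np.log(p + 1e-12)))

def cosine_similarity(f_src, f_syn):
    """Cosine similarity between flattened feature maps."""
    a, b = f_src.ravel(), f_syn.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def zodi_style_loss(probs_src, probs_syn, labels, feats_src, feats_syn, lam=0.1):
    """Segmentation loss on both domains, minus a weighted similarity term.

    The source image and its target-like synthesized version share the same
    ground-truth segmentation map; subtracting the similarity term means that
    minimizing the loss maximizes feature similarity, encouraging
    domain-robust representations.
    """
    seg = cross_entropy(probs_src, labels) + cross_entropy(probs_syn, labels)
    sim = cosine_similarity(feats_src, feats_syn)
    return seg - lam * sim
```

For instance, with identical features from both domains the similarity term reaches its maximum of 1, and the loss reduces to the two segmentation terms minus `lam`.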

Place, publisher, year, edition, pages
Springer Nature, 2025. Vol. 15640, p. 151-167
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords [en]
Zero-Shot Domain Adaptation, Diffusion Models, Segmentation
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-374117
DOI: 10.1007/978-3-031-91672-4_10
ISI: 001544992100010
Scopus ID: 2-s2.0-105006822426
OAI: oai:DiVA.org:kth-374117
DiVA, id: diva2:2022003
Conference
18th European Conference on Computer Vision (ECCV), SEP 29-OCT 04, 2024, Milan, Italy
Note

Part of ISBN 978-3-031-91671-7; 978-3-031-91672-4

QC 20251216

Available from: 2025-12-16 Created: 2025-12-16 Last updated: 2025-12-16. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Maki, Atsuto

Search in DiVA

By author/editor
Azuma, Hiroki; Maki, Atsuto
By organisation
KTH; Robotics, Perception and Learning, RPL
Computer graphics and computer vision
