CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation
Shanghai Jiao Tong University, School of Computer Science, China; Sun Yat-sen University, School of Atmospheric Sciences, China.
Chinese Academy of Sciences, Key Laboratory of Cryospheric Science and Frozen Soil Engineering, Northwest Institute of Eco-Environment and Resources, China; University of Science and Technology of China, School of Engineering Science, China.
Wuhan University, School of Computer Science, China; Zhongguancun Academy, China.
Beijing Institute of Technology, School of Information and Electronics, China.
2026 (English). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539. Article in journal (Refereed). Epub ahead of print.
Abstract [en]

Remote Sensing (RS) images exhibit substantial domain gaps arising from variations in location, wavelength, and sensor type. Remote Sensing Domain Generalization (RSDG) has therefore emerged as a critical research frontier, focused on developing models that generalize effectively across diverse scenarios. However, this area remains underexplored: (1) Current cross-domain methods primarily focus on Domain Adaptation (DA), which adapts models to predefined domains rather than to unseen ones; (2) Few studies target the RSDG problem, especially for semantic segmentation tasks, and existing related models are developed for specific unknown domains and underfit on other unseen scenarios; (3) Existing RS foundation models tend to prioritize in-domain performance over cross-domain generalization. To this end, we introduce CrossEarth, the first vision foundation model for RSDG semantic segmentation. CrossEarth demonstrates strong cross-domain generalization through a specially designed data-level Earth-Style Injection pipeline and a model-level Multi-Task Training pipeline. In addition, for the semantic segmentation task, we have curated an RSDG benchmark comprising 32 semantic segmentation scenarios across various regions, spectral bands, platforms, and climates, providing comprehensive evaluations of the generalizability of future RSDG models. Extensive experiments on this collection demonstrate the superiority of CrossEarth over existing state-of-the-art methods.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2026.
Keywords [en]
Domain generalization, masked image modeling, remote sensing, semantic segmentation, vision foundation model
National Category
Computer Sciences; Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-375757
DOI: 10.1109/TPAMI.2025.3649001
PubMedID: 41460898
Scopus ID: 2-s2.0-105026375219
OAI: oai:DiVA.org:kth-375757
DiVA, id: diva2:2030890
Note

QC 20260121

Available from: 2026-01-21. Created: 2026-01-21. Last updated: 2026-01-21. Bibliographically approved.

Authority records

Jia, Yuru
