Publications (10 of 13)
Yadav, R. & Nascetti, A. (2025). A Multi-Modal, Multi-Temporal, Multi-Resolution Benchmark Dataset for Building Height Estimation.
2025 (English). Article in journal (Other academic). Accepted
National Category
Earth Observation Computer Sciences
Identifiers
urn:nbn:se:kth:diva-371708 (URN)
Note

Accepted by Scientific Data (Nature Publishing Group), ISSN 2052-4463

QC 20251020

Available from: 2025-10-16 Created: 2025-10-16 Last updated: 2025-10-20 Bibliographically approved
Yadav, R., Nascetti, A. & Ban, Y. (2025). How high are we? Large-scale building height estimation at 10 m using Sentinel-1 SAR and Sentinel-2 MSI time series. Remote Sensing of Environment, 318, Article ID 114556.
2025 (English). In: Remote Sensing of Environment, ISSN 0034-4257, E-ISSN 1879-0704, Vol. 318, article id 114556. Article in journal (Refereed). Published
Abstract [en]

Accurate building height estimation is essential to support urbanization monitoring, environmental impact analysis and sustainable urban planning. However, conducting large-scale building height estimation remains a significant challenge. While deep learning (DL) has proven effective for large-scale mapping tasks, there is a lack of advanced DL models specifically tailored for height estimation, particularly when using open-source Earth observation data. In this study, we propose T-SwinUNet, an advanced DL model for large-scale building height estimation leveraging Sentinel-1 SAR and Sentinel-2 multispectral time series. The T-SwinUNet model contains a feature extractor with local/global feature comprehension capabilities, a temporal attention module to learn the correlation between constant and variable features of building objects over time, and an efficient multitask decoder to predict building height at 10 m spatial resolution. The model is trained and evaluated on data from the Netherlands, Switzerland, Estonia, and Germany, and its generalizability is evaluated on an out-of-distribution (OOD) test set from ten additional cities in other European countries. Our study incorporates extensive model evaluations, ablation experiments, and comparisons with established models. T-SwinUNet predicts building height with a Root Mean Square Error (RMSE) of 1.89 m, outperforming state-of-the-art models at 10 m spatial resolution. Its strong generalization to the OOD test set (RMSE of 3.2 m) underscores its potential for low-cost building height estimation across Europe, with future scalability to other regions. Furthermore, the assessment at 100 m resolution reveals that T-SwinUNet (0.29 m RMSE, 0.75 R²) also outperforms the global building height product GHSL-Built-H R2023A (0.56 m RMSE, 0.37 R²). Our implementation is available at: https://github.com/RituYadav92/Building-Height-Estimation.
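
The abstract reports RMSE at both 10 m and 100 m resolution. How the 100 m assessment was produced is not stated here, so the following is only a minimal sketch that assumes simple block averaging of the 10 m grid. The function names `block_mean` and `rmse_2d` are hypothetical, and the toy example uses a factor of 2 instead of 10 for brevity. It illustrates that per-pixel errors can partially cancel once values are averaged to a coarser grid.

```python
import math


def block_mean(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D list (assumed aggregation)."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def rmse_2d(pred, ref):
    """Root mean square error between two 2D lists of equal shape."""
    squared = [(p - t) ** 2
               for pred_row, ref_row in zip(pred, ref)
               for p, t in zip(pred_row, ref_row)]
    return math.sqrt(sum(squared) / len(squared))


# Toy heights in metres: every fine pixel is off by 2 m, but the errors cancel per block.
pred_fine = [[0.0, 4.0], [4.0, 0.0]]
ref_fine = [[2.0, 2.0], [2.0, 2.0]]

fine_error = rmse_2d(pred_fine, ref_fine)
coarse_error = rmse_2d(block_mean(pred_fine, 2), block_mean(ref_fine, 2))
print(fine_error, coarse_error)
```

Because of this cancellation, RMSE values at different resolutions are best compared only against products evaluated at the same resolution, as the abstract does.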

Place, publisher, year, edition, pages
Elsevier BV, 2025
Keywords
Building height estimation, Multitask learning, Out-of-distribution generalization, Regression, Sentinel, Time series
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-358166 (URN); 10.1016/j.rse.2024.114556 (DOI); 001413894800001 (); 2-s2.0-85212150378 (Scopus ID)
Note

QC 20250217

Available from: 2025-01-07 Created: 2025-01-07 Last updated: 2025-10-16 Bibliographically approved
Yadav, R. (2025). Multi-Modal Deep Learning for 2D/3D Mapping with Satellite Time Series images: From Floods to Forests to Cities. (Doctoral dissertation). Stockholm: KTH Royal Institute of Technology
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Driven by climate change and rapid urbanization, there is an urgent need for reliable large-scale Earth observation (EO) products that capture both two-dimensional (2D) and three-dimensional (3D) characteristics of the Earth’s surface. Modern satellite missions, particularly Sentinel-1 Synthetic Aperture Radar (SAR) and Sentinel-2 MultiSpectral Instrument (MSI), provide freely accessible global-scale imagery with frequent revisits, offering new opportunities for large-scale mapping such as floods, urban growth, and forest dynamics. Concurrently, deep learning (DL) has become state-of-the-art for EO analysis. However, challenges remain in ensuring generalization across regions, reducing reliance on labeled data, extracting 3D features from mid-resolution imagery, and enhancing reliability through uncertainty estimation. This thesis addresses these challenges by proposing novel DL models for 2D and 3D applications, improving model generalizability, curating benchmark datasets, and integrating uncertainty estimation into EO tasks.

For 2D mapping, this thesis focuses on flood mapping as the primary application. Two supervised segmentation networks were developed for the task. The first, Attentive U-Net, enhances Sentinel-1 VV, VH, and VV/VH ratio inputs using spatial and channel-wise self-attention. The second, a dual-stream Fusion Network, integrates Sentinel-1 data with DEM and permanent water masks for improved contextual learning. Both outperformed supervised baselines on the Sen1Floods11 dataset, achieving 3–5% higher IoU. To further improve model generalizability and reduce dependency on labels, an unsupervised model (CLVAE) was developed that learns spatiotemporal features from Sentinel-1 SAR time series using reconstruction and contrastive learning. Flood maps are derived by detecting changes in the latent feature distributions of pre-flood and post-flood time series images. CLVAE achieved 70% IoU, surpassing unsupervised baselines by a margin of at least 15% IoU and outperforming supervised models when tested on unseen flood sites, which indicates stronger generalizability.

For 3D mapping, multiple advances were made. A hybrid CNN-transformer architecture (T-SwinUNet) was proposed for large-scale building height estimation from 12-month Sentinel-1 and Sentinel-2 time series. Leveraging multi-modal spatio-temporal features and multitask learning, it achieved 1.89 m RMSE at 10 m resolution and generalized across diverse European cities. The model outperformed the existing global height product GHSL-Built-H.

To further improve building height estimation accuracy, the M4Heights benchmark dataset was released, covering sites in Estonia, Switzerland, and the Netherlands. Combining 10 m Sentinel-1&2 time series with 1 m aerial orthophotos enables multi-scale and multitask learning for super-resolution building height estimation. Baseline evaluations confirmed its benefits, and the open dataset supports fair model comparisons and encourages further innovation in the field.

Extending 3D mapping from the built environment to natural ecosystems, the BioMassters benchmark dataset for above-ground forest biomass estimation was curated and released. It covers 8.5 million hectares of Finnish forests, with labels derived from high-resolution LiDAR data and inputs from Sentinel-1&2 time series. Released alongside a global challenge with over 1000 model submissions, the results demonstrated the superiority of DL methods over the coarse 100 m ESA CCI Biomass product, enabling biomass mapping at 10 m resolution and underscoring the importance of open, DL-ready datasets.

The thesis further advances 3D mapping by integrating uncertainty quantification into large-scale regression tasks for building height, canopy height, and biomass estimation at 10 m resolution. Two uncertainty quantification approaches were investigated: (i) a Gaussian uncertainty model, which assumes symmetric error distributions, and (ii) a quantile uncertainty model, which provides asymmetric intervals and captures the direction of uncertainty. Both methods achieved accuracy comparable to deterministic baselines while additionally providing calibrated confidence intervals. Importantly, they outperformed existing global canopy and biomass products that include uncertainty information. The Gaussian model performed best for canopy height and biomass, while the quantile model proved more robust for building height, where data follow strictly non-Gaussian and skewed distributions. Together, these advances establish uncertainty-aware regression as a critical step toward making EO-derived 3D products more trustworthy for real-world applications.
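
The contrast between the two uncertainty models can be illustrated with their standard textbook loss functions. The thesis abstract does not give the exact formulations used, so this is a sketch under that assumption. The function names `gaussian_nll` and `pinball_loss` are hypothetical. A Gaussian negative log-likelihood penalizes over- and under-estimation equally, whereas the quantile (pinball) loss weights the two directions differently depending on the quantile level `tau`.

```python
import math


def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under a Gaussian with mean mu and std sigma."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)


def pinball_loss(y, q_pred, tau):
    """Quantile (pinball) loss for a predicted quantile q_pred at level tau in (0, 1)."""
    diff = y - q_pred
    return max(tau * diff, (tau - 1.0) * diff)


# Symmetric: a 2 m error above or below the predicted mean costs the same.
print(gaussian_nll(12.0, 10.0, 2.0), gaussian_nll(8.0, 10.0, 2.0))

# Asymmetric: for the 0.9 quantile, a target above the prediction costs far more
# than a target the same distance below it.
print(pinball_loss(12.0, 10.0, 0.9), pinball_loss(8.0, 10.0, 0.9))
```

Training one output per quantile level (for example 0.05, 0.5, and 0.95) yields an interval whose upper and lower widths may differ, which is what makes the approach suitable for skewed targets.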

In conclusion, this thesis addresses key challenges in large-scale 2D and 3D EO tasks, spanning flood detection, building height estimation, biomass estimation, and canopy height estimation. By advancing DL models that leverage time series of Sentinel-1&2 imagery, integrating uncertainty quantification into the model and releasing benchmark datasets, this thesis makes major contributions to scalable, reliable and reproducible EO data products. These advances enhance the trustworthiness of EO-derived products for real-world applications, supporting sustainable urban planning, climate resilience, and the monitoring of Sustainable Development Goals.

Abstract [sv]

Driven by climate change and rapid urbanization, there is an urgent need for reliable large-scale Earth observation (EO) products that capture both two-dimensional (2D) and three-dimensional (3D) characteristics of the Earth's surface. Modern satellite missions, particularly Sentinel-1 Synthetic Aperture Radar (SAR) and Sentinel-2's MultiSpectral Instrument (MSI), provide freely available global-scale imagery with frequent revisits, which opens new opportunities for large-scale mapping of, for example, floods, urban growth, and forest dynamics. At the same time, deep learning (DL) has become the leading approach for EO analysis. However, challenges remain in ensuring generalization across regions, reducing the dependence on labeled data, extracting 3D information from medium-resolution imagery, and increasing reliability through uncertainty estimation. This thesis addresses these challenges by proposing new DL models for 2D and 3D applications, improving model generalizability, curating benchmark datasets, and integrating uncertainty estimation into EO tasks.

For 2D mapping, the thesis focuses on flood mapping as the main application. Two supervised segmentation networks were developed for the task. The first, Attentive U-Net, uses Sentinel-1 inputs (VV, VH, and the VV/VH ratio) and enhances them with spatial and channel-wise self-attention. The second, a dual-stream fusion network, integrates Sentinel-1 data with a digital elevation model (DEM) and permanent water masks for improved contextual learning. Both outperformed supervised baseline models on the Sen1Floods11 dataset, achieving 3-5% higher IoU. To further improve model generalizability and reduce the dependence on labeled data, an unsupervised model (CLVAE) was developed that learns spatiotemporal features from Sentinel-1 SAR time series via reconstruction and contrastive learning. Flood maps are derived by detecting changes in the latent representation distributions between pre-flood and post-flood time series. CLVAE achieved 70% IoU, surpassed unsupervised baselines by at least 15% IoU, and performed better than supervised models when tested on previously unseen flood sites, indicating higher model generalizability.

For 3D mapping, several advances were made. A hybrid CNN-transformer architecture (T-SwinUNet) was proposed for large-scale building height estimation from 12-month Sentinel-1 and Sentinel-2 time series. By exploiting multimodal spatiotemporal features and multitask learning, it achieved an RMSE of 1.89 m at 10 m resolution, and the model generalized well across different European cities. It outperformed the existing global building height product GHSL-Built-H. To further improve the accuracy of building height estimation, the M4Heights benchmark dataset was released, covering areas in Estonia, Switzerland, and the Netherlands. Combining 10 m Sentinel-1&2 time series with 1 m aerial orthophotos enables multi-scale and multitask learning for super-resolved building height estimation. Baseline evaluations confirmed its benefits, and the open dataset supports fair model comparisons and encourages further innovation in the field.

Extending 3D mapping from the built environment to natural ecosystems, the BioMassters benchmark dataset for above-ground forest biomass estimation was curated and released. It covers 8.5 million hectares of Finnish forests, with labels derived from high-resolution LiDAR data and inputs from Sentinel-1&2 time series. The dataset was released together with a global competition that drew over 1000 model submissions. The results showed the superiority of DL methods over the coarse 100 m ESA CCI Biomass product, enabling biomass mapping at 10 m resolution and underlining the importance of open, deep-learning-ready datasets.

The thesis advances 3D mapping further by integrating uncertainty quantification into large-scale regression tasks for building height, tree height, and biomass at 10 m resolution. Two uncertainty quantification methods were investigated: (i) a Gaussian uncertainty model, which assumes symmetric errors, and (ii) a quantile model, which provides asymmetric intervals and captures the direction of the uncertainty. Both methods achieved accuracy comparable to deterministic models while also providing calibrated confidence intervals. Importantly, they performed better than existing global tree height and biomass products that include uncertainty information. The Gaussian model performed best for tree height and biomass, while the quantile model proved more robust for building height, where the data follow non-Gaussian and skewed patterns. Together, these advances establish uncertainty-aware regression as a decisive step toward making EO-derived 3D products more reliable for real-world applications.

In summary, this thesis addresses central challenges in large-scale 2D and 3D EO tasks, from flood detection to the estimation of building height, biomass, and tree height. By developing DL models that exploit Sentinel-1&2 time series, integrating uncertainty quantification into the models, and releasing benchmark datasets, the thesis contributes scalable, reliable, and reproducible EO data products. These advances increase trust in EO-derived products in practical applications and support sustainable urban planning, climate adaptation, and the monitoring of the Global Goals for Sustainable Development (the SDGs).

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2025. p. 106
Series
TRITA-ABE-DLT ; 2540
Keywords
2D mapping, 3D mapping, Floods, Building Height, Biomass, Canopy Height, Uncertainty Estimation, Segmentation, Change Detection, Regression, Gaussian, Quantile, Unsupervised Learning, Contrastive Learning, Multi-task Learning, Self-Attention, Convolutional LSTM, VAE, CNN, transformer, SWIN, Remote Sensing, Sentinel-1 SAR, Sentinel-2 MSI, Aerial Orthophotos, DEM, Data Fusion, Time Series, Deep Learning, Generalization, OOD
National Category
Computer Sciences Earth Observation
Research subject
Geodesy and Geoinformatics, Geoinformatics
Identifiers
urn:nbn:se:kth:diva-371709 (URN); 978-91-8106-444-5 (ISBN)
Public defence
2025-11-04, Kollegiesalen, Brinellvägen 8, KTH Campus, public video conference link https://kth-se.zoom.us/j/68698558153, Stockholm, 14:00 (English)
Projects
AI4EO, Digital Future
Note

QC 20251017

Available from: 2025-10-17 Created: 2025-10-16 Last updated: 2025-11-03 Bibliographically approved
Yadav, R., Nascetti, A., Azizpour, H. & Ban, Y. (2024). Unsupervised Flood Detection on SAR Time Series using Variational Autoencoder. International Journal of Applied Earth Observation and Geoinformation, 126, Article ID 103635.
2024 (English). In: International Journal of Applied Earth Observation and Geoinformation, ISSN 1569-8432, E-ISSN 1872-826X, Vol. 126, article id 103635. Article in journal (Other academic). Published
Abstract [en]

In this study, we propose a novel unsupervised Change Detection (CD) model to detect flood extent using Synthetic Aperture Radar (SAR) time series data. The proposed model is based on a spatiotemporal variational autoencoder, trained with reconstruction and contrastive learning techniques. The change maps are generated with a proposed novel algorithm that utilizes differences in latent feature distributions between pre-flood and post-flood data. The model is evaluated on nine different flood events by comparing the results with reference flood maps collected from the Copernicus Emergency Management Services (CEMS) and the Sen1Floods11 dataset. We conducted a range of experiments and ablation studies to investigate the performance of our model and compared the results with existing unsupervised models. The model achieved an average Intersection over Union (IoU) score of 70%, which is at least 7% better than the IoU from existing unsupervised CD models. In the generalizability test, the proposed model outperformed supervised models ADS-Net (by 10% IoU) and DAUSAR (by 8% IoU), both trained on Sen1Floods11 and tested on CEMS sites.
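
The abstract does not spell out the change-map algorithm, so the following is not the authors' method. It is a minimal sketch of one generic way to compare latent distributions, assuming each location is encoded as a diagonal Gaussian (means and variances) before and after the event and that change is flagged where the KL divergence exceeds a fixed threshold. The names `kl_diag_gaussians` and `change_mask`, the use of KL divergence, and the threshold value are all assumptions made for illustration.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) between two diagonal Gaussians given as lists of means and variances."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def change_mask(pre, post, threshold):
    """Flag locations whose pre/post latent distributions diverge by more than threshold.

    pre and post are lists of (means, variances) pairs, one pair per location.
    """
    return [
        kl_diag_gaussians(mu_a, var_a, mu_b, var_b) > threshold
        for (mu_a, var_a), (mu_b, var_b) in zip(pre, post)
    ]


# Two locations with 2-dimensional latents: the first is unchanged, the second shifts.
pre = [([0.0, 0.0], [1.0, 1.0]), ([0.0, 0.0], [1.0, 1.0])]
post = [([0.0, 0.0], [1.0, 1.0]), ([2.0, 2.0], [1.0, 1.0])]
print(change_mask(pre, post, threshold=1.0))
```

In a real pipeline the means and variances would come from the encoder of the variational autoencoder, and the threshold would need to be chosen or estimated per scene.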

Place, publisher, year, edition, pages
Elsevier BV, 2024
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-338773 (URN); 10.1016/j.jag.2023.103635 (DOI); 001143611500001 (); 2-s2.0-85181026128 (Scopus ID)
Note

QC 20251029

Available from: 2023-10-25 Created: 2023-10-25 Last updated: 2025-10-29 Bibliographically approved
Yadav, R., Nascetti, A. & Ban, Y. (2023). A CNN regression model to estimate buildings height maps using Sentinel-1 SAR and Sentinel-2 MSI time series. In: Proceedings IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium. Paper presented at IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, 16-21 July 2023, Pasadena CA, USA. Institute of Electrical and Electronics Engineers (IEEE)
2023 (English). In: Proceedings IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Accurate estimation of building heights is essential for urban planning, infrastructure management, and environmental analysis. In this study, we propose a supervised Multimodal Building Height Regression Network (MBHR-Net) for estimating building heights at 10 m spatial resolution using Sentinel-1 (S1) and Sentinel-2 (S2) satellite time series. S1 provides Synthetic Aperture Radar (SAR) data that offers valuable information on building structures, while S2 provides multispectral data that is sensitive to different land cover types, vegetation phenology, and building shadows. MBHR-Net aims to extract meaningful features from the S1 and S2 images to learn complex spatio-temporal relationships between image patterns and building heights. The model is trained and tested on 10 cities in the Netherlands. Root Mean Squared Error (RMSE), Intersection over Union (IoU), and R-squared (R²) metrics are used to evaluate the performance of the model. The preliminary results (3.73 m RMSE, 0.95 IoU, 0.61 R²) demonstrate the effectiveness of our deep learning model in estimating building heights, showcasing its potential for urban planning, environmental impact analysis, and other related applications.
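
The three reported metrics follow standard definitions, sketched below in plain Python. The abstract does not say how IoU is derived from a height regression output, so the example assumes a building mask obtained by thresholding heights above 0 m. That threshold and the function names `rmse`, `r2_score`, and `iou` are illustrative assumptions.

```python
import math


def rmse(pred, ref):
    """Root mean squared error between predicted and reference heights."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def r2_score(pred, ref):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_ref = sum(ref) / len(ref)
    ss_res = sum((r - p) ** 2 for p, r in zip(pred, ref))
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    return 1.0 - ss_res / ss_tot


def iou(pred_mask, ref_mask):
    """Intersection over union of two boolean masks."""
    inter = sum(1 for p, r in zip(pred_mask, ref_mask) if p and r)
    union = sum(1 for p, r in zip(pred_mask, ref_mask) if p or r)
    return inter / union if union else 1.0


# Toy per-pixel heights in metres (0 means no building).
ref_heights = [0.0, 0.0, 6.0, 9.0, 12.0]
pred_heights = [0.0, 3.0, 6.0, 9.0, 9.0]

# Assumed mask rule: any pixel with height above 0 m counts as building.
ref_mask = [h > 0.0 for h in ref_heights]
pred_mask = [h > 0.0 for h in pred_heights]

print(rmse(pred_heights, ref_heights))
print(r2_score(pred_heights, ref_heights))
print(iou(pred_mask, ref_mask))
```

RMSE and R² measure the quality of the height values, while IoU only measures how well building and non-building pixels are separated, so a model can score high on IoU and still have a sizeable height error.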

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-338771 (URN); 10.1109/IGARSS52108.2023.10283039 (DOI); 001098971603011 ()
Conference
IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, 16-21 July 2023, Pasadena CA, USA
Note

QC 20231025

Available from: 2023-10-25 Created: 2023-10-25 Last updated: 2025-02-10 Bibliographically approved
Nascetti, A., Yadav, R., Brodt, K., Qu, Q., Fan, H., Shendryk, Y., . . . Chung, C. (2023). BioMassters: A Benchmark Dataset for Forest Biomass Estimation using Multi-modal Satellite Time Series. In: Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023. Paper presented at 37th Conference on Neural Information Processing Systems, NeurIPS 2023, Dec 10-16 2023, New Orleans, United States of America. Neural Information Processing Systems Foundation
2023 (English). In: Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023, Neural Information Processing Systems Foundation, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Above-ground biomass is an important variable because forests play a crucial role in mitigating climate change, acting as an efficient, natural and cost-effective carbon sink. Traditional field and airborne LiDAR measurements have been proven to provide reliable estimations of forest biomass. Nevertheless, the use of these techniques at a large scale can be challenging and expensive. Satellite data have been widely used as a valuable tool in estimating biomass on a global scale. However, the potential of dense multi-modal satellite time series data, in combination with modern Deep Learning (DL) approaches, has yet to be fully explored. The aim of the "BioMassters" data challenge and benchmark dataset is to investigate the potential of multi-modal satellite data (Sentinel-1 SAR and Sentinel-2 MSI) to estimate forest biomass at a large scale using the Finnish Forest Centre's open forest and nature airborne LiDAR data as a reference. The performance of the top three baseline models shows the potential of DL to produce accurate and higher-resolution biomass maps. The dataset and the code are available on the project website: https://nascetti-a.github.io/BioMasster/.

Place, publisher, year, edition, pages
Neural Information Processing Systems Foundation, 2023
National Category
Forest Science Earth Observation
Identifiers
urn:nbn:se:kth:diva-346139 (URN); 2-s2.0-85191176383 (Scopus ID)
Conference
37th Conference on Neural Information Processing Systems, NeurIPS 2023, Dec 10-16 2023, New Orleans, United States of America
Note

QC 20240506

Available from: 2024-05-03 Created: 2024-05-03 Last updated: 2025-10-16 Bibliographically approved
Yadav, R., Nascetti, A. & Ban, Y. (2023). Context-Aware Change Detection With Semi-Supervised Learning. In: Proceedings IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium. Paper presented at IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, Pasadena CA, USA, 16-21 July 2023. Institute of Electrical and Electronics Engineers (IEEE)
2023 (English). In: Proceedings IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Change detection using Earth observation data plays a vital role in quantifying the impact of disasters in affected areas. While data sources like Sentinel-2 provide rich optical information, they are often hindered by cloud cover, limiting their usage in disaster scenarios. However, leveraging pre-disaster optical data can offer valuable contextual information about the area, such as land cover type, vegetation cover, and soil types, enabling a better understanding of the disaster's impact. In this study, we develop a model to assess the contribution of pre-disaster Sentinel-2 data in change detection tasks, focusing on disaster-affected areas. The proposed Context-Aware Change Detection Network (CACDN) utilizes a combination of pre-disaster Sentinel-2 data, pre- and post-disaster Sentinel-1 data, and ancillary Digital Elevation Model (DEM) data. The model is validated on flood and landslide detection and evaluated using three metrics: Area Under the Precision-Recall Curve (AUPRC), Intersection over Union (IoU), and mean IoU. The preliminary results show significant improvement (4% AUPRC, 3-7% IoU, 3-6% mean IoU) in the model's change detection capabilities when pre-disaster optical data is incorporated, reflecting the effectiveness of using contextual information for accurate flood and landslide detection.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-338769 (URN); 10.1109/IGARSS52108.2023.10281798 (DOI); 001098971605224 (); 2-s2.0-85178343469 (Scopus ID)
Conference
IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, Pasadena CA, USA, 16-21 July 2023
Note

Part of proceedings ISBN 979-8-3503-2010-7

QC 20231025

Available from: 2023-10-25 Created: 2023-10-25 Last updated: 2025-02-10 Bibliographically approved
Yadav, R., Nascetti, A., Azizpour, H. & Ban, Y. (2023). Self-Supervised Contrastive Model for Flood Mapping and Monitoring on SAR Time-Series. Paper presented at EGU23 General Assembly, Vienna, Austria & Online, 23–28 April 2023. Copernicus GmbH
2023 (English). Conference paper, Oral presentation with published abstract (Refereed)
Place, publisher, year, edition, pages
Copernicus GmbH, 2023
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-338772 (URN); 10.5194/egusphere-egu23-14375 (DOI)
Conference
EGU23 General Assembly, Vienna, Austria & Online, 23–28 April 2023
Note

QC 20231025

Available from: 2023-10-25 Created: 2023-10-25 Last updated: 2025-02-10 Bibliographically approved
Yadav, R. (2023). Supervised and Unsupervised Deep Learning Models for Flood Detection. (Licentiate dissertation). Stockholm: KTH Royal Institute of Technology
2023 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Human civilization has an increasingly powerful influence on the Earth system. Affected by climate change and land-use change, floods are occurring across the globe and are expected to increase in the coming years. The current situation calls for more focus on efficient monitoring of floods and detection of impacted areas. Earth observations are an invaluable source for monitoring the Earth's surface at a large scale. In particular, the Sentinel-1 Synthetic Aperture Radar (SAR) and Sentinel-2 MultiSpectral Instrument (MSI) missions offer high-resolution data with frequent global revisits that are widely used for flood detection.

Current solutions such as the Copernicus Emergency Management Services (CEMS), the MODIS (Moderate Resolution Imaging Spectroradiometer) global flood product, and many others use data from Sentinel and multiple other satellites to detect floods. Although existing solutions are helpful, they also have several limitations. For instance, solutions like the MODIS global flood product detect floods solely on optical images, causing poor or no detection in cloudy areas. In addition, these solutions are threshold-based and often require criteria-based adjustments. Furthermore, they do not leverage the rich spatial information between neighboring pixels and do not use the temporal features of time series data. Therefore, advanced processing algorithms are needed to provide a reliable method for flood detection.

This thesis presents three Deep Learning (DL) models for flood detection. The first two models are supervised segmentation models proposed to detect floods on uni-temporal Sentinel-1 SAR data. The study sites contain floods from Bolivia, Ghana, India, Mekong, Nigeria, Pakistan, Paraguay, Somalia, Spain, Sri Lanka and USA. The third model is an unsupervised spatiotemporal change detection (CD) model that detects floods on time series of Sentinel-1 SAR data. The study sites contain floods from Slovakia, Somalia, Spain, Bolivia, Mekong, Bosnia, Australia, Scotland and Germany.

The two supervised segmentation models aim to improve flood detection with the help of a self-attention mechanism and the fusion of Sentinel-1 SAR with more contextual information. The first network is 'Attentive U-Net'. It takes Sentinel-1 channels VV (vertical transmit, vertical receive), VH (vertical transmit, horizontal receive), and the ratio VV/VH as input. The network uses spatial and channel-wise self-attention to enhance feature maps, resulting in better segmentation. The second network is a dual-stream attentive 'Fusion network', where the global low-resolution elevation data and permanent water masks are fused with Sentinel-1 (VV, VH) data. The 'Attentive U-Net' yields 67.2% Intersection over Union (IoU), and the 'Fusion network' yields 69.5% IoU on the Sen1Floods11 dataset. The performance gain is 3 to 5% IoU with respect to existing supervised models like FCNN (49.3% IoU score), U2Net (62% IoU score), and BASNet (64% IoU score). Quantitatively, the two proposed networks show significant improvement over benchmark methods, demonstrating their potential. The qualitative analysis demonstrates the contribution of low-resolution elevation and a permanent water mask in enhancing flood detection. Ablation experiments further clarify the effectiveness of the ratio input, self-attention, and various design choices made in the proposed networks.

Furthermore, to improve the across-region generalizability of the flood detection model and to eliminate the dependency on labels, a novel unsupervised CD model is presented that detects floods as changes in SAR time series data. The proposed model is trained to learn spatiotemporal features of the SAR time series data with the help of two unsupervised learning techniques, reconstruction and contrastive learning. The change maps are generated with a novel algorithm that utilizes the learned latent feature distributions of pre-flood and post-flood data. The model achieved an average of 70% IoU score, outperforming existing flood detection models like RaVAEn (45.03% IoU score), cGAN (51.49% IoU score) and SCCN (54.87% IoU score) with a significant minimum margin of 15% IoU score. The proposed model is tested for generalizability and outperformed supervised models ADS-Net and DAUSAR when tested on unseen CEMS flood sites. In addition, an automatic change monitoring and change point detection framework is proposed. The framework is based on the proposed unsupervised CD model: time series data is processed through the model to identify the percentage change at each time stamp, and the change point is detected by identifying the date on which significant change started to appear in the SAR data. When integrated with high-temporal-resolution data, i.e. daily images from ICEYE, the framework can help in continuous flood monitoring and early detection of slowly proceeding disaster events, giving more time for response.

Overall, this thesis contributes supervised and unsupervised flood detection models, enabling comprehensive and widely applicable flood mapping and monitoring capabilities. These advancements facilitate near-real-time disaster response and resilient urban development, thus contributing to SDG 11 - Sustainable Cities and Communities.
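
The change point detection idea described in this abstract (a percentage of changed area per time stamp, then the first date with significant change) can be sketched as follows. The abstract does not define what counts as significant, so the 5% cut-off, the per-date boolean change masks, and the names `percent_changed` and `first_change_point` are assumptions for illustration only.

```python
from datetime import date


def percent_changed(mask):
    """Percentage of locations flagged as changed in a boolean change mask."""
    return 100.0 * sum(1 for flagged in mask if flagged) / len(mask)


def first_change_point(dates, masks, min_percent=5.0):
    """Return the first date whose changed percentage reaches min_percent, else None."""
    for acquisition_date, mask in zip(dates, masks):
        if percent_changed(mask) >= min_percent:
            return acquisition_date
    return None


# Toy time series of three acquisitions over a 10-pixel area.
dates = [date(2021, 7, 1), date(2021, 7, 13), date(2021, 7, 25)]
masks = [
    [False] * 10,               # 0% changed
    [True] + [False] * 9,       # 10% changed
    [True] * 4 + [False] * 6,   # 40% changed
]
print(first_change_point(dates, masks))
```

With denser acquisitions the same loop would simply run over more dates, which is where a high revisit rate helps to narrow down the onset of a slowly developing event.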

Abstract [sv]

Den mänskliga civilisationen har ett allt starkare inflytande på jordsystemet. Påverkad av klimatförändringar och förändringar i markanvändningen sker översvämningar över hela världen och förväntas öka under de kommande åren. Nuvarande situationer kräver mer fokus på effektiv övervakning av översvämningar och upptäckt av drabbade områden. Jordobservationer är en ovärderlig källa för att övervaka jordens yta i stor skala. Särskilt Sentinel-1 Synthetic Aperture Radar (SAR) och Sentinel-2 MultiSpectral Instrument (MSI)-uppdrag erbjuder högupplösta data med frekventa globala återbesök som används ofta för att detektera översvämningar.

Aktuella lösningar som Copernicus Emergency Management Services (CEMS), MODIS (Moderate Resolution Imaging Spectroradiometer) globala översvämningsprodukter och många andra använder data från Sentinel och flera andra satelliter för att upptäcka översvämningar. Även om befintliga lösningar är användbara, har de också flera begränsningar. Till exempel upptäcker lösningar som MODIS globala översvämningsprodukt översvämningar enbart på optiska bilder som orsakar dålig eller ingen detektering i molniga områden. Dessutom är dessa lösningar tröskelbaserade och kräver ofta kriteriebaserade justeringar. Dessutom utnyttjar dessa lösningar inte rik rumslig information mellan angränsande pixlar och använder inte tidsseriedata. Därför behövs avancerade bearbetningsalgoritmer för att tillhandahålla en tillförlitlig metod för översvämningsdetektering.

Denna avhandling presenterar tre modeller för djupinlärning (DL) för översvämningsdetektering. De två första modellerna är övervakade segmenteringsmodeller som föreslagits för att upptäcka översvämningar på uni-temporala Sentinel-1 SAR-data. Studieplatserna innehåller översvämningar från Bolivia, Ghana, Indien, Mekong, Nigeria, Pakistan, Paraguay, Somalia, Spanien, Sri Lanka och USA. Den tredje modellen är en oövervakad modell för upptäckt av spatiotemporal förändring (CD) som detekterar översvämningar på tidsserier av Sentinel-1 SAR-data. Studieplatserna innehåller översvämningar från Slovakien, Somalia, Spanien, Bolivia, Mekong, Bosnien, Australien, Skottland och Tyskland.

De två övervakade segmenteringsmodellerna föreslår förbättring av översvämningsdetektering med hjälp av självuppmärksamhetsmekanism och fusion av Sentinel-1 SAR med mer kontextuell information. Det första nätverket är ’Attentive U-Net’. Den tar Sentinel-1-kanalerna VV (vertikal sändning, vertikal mottagning), VH (vertikal sändning, horisontell mottagning) och förhållandet VV/VH som ingång. Nätverket använder rumslig och kanalvis självuppmärksamhet för att förbättra funktionskartor vilket resulterar i bättre segmentering. Det andra nätverket är ett dubbelströms uppmärksamt ’Fusion-nätverk’, där globala lågupplösta höjddata och permanenta vattenmasker smälts samman med Sentinel-1-data (VV, VH). ’Attentive U-Net’ ger 67,2% Intersection over Union (IoU), och ’Fusion-nätverket’ gav 69,5% IoU på Sen1Floods11-datauppsättningen. Prestandavinsten är 3 till 5% IoU med avseende på de befintliga övervakade modellerna som FCNN (49,3% IoU-poäng), U2Net (62% IoU-poäng) och BASNet (64% IoU-poäng). Kvantitativt visar de två föreslagna nätverken betydande förbättringar jämfört med benchmarkmetoder, vilket visar deras potential. Den kvalitativa analysen visar bidraget från lågupplöst höjd och en permanent vattenmask för att förbättra översvämningsdetektering. Ablationsexperiment klargör ytterligare effektiviteten av ratio, självuppmärksamhet och olika designval som gjorts i föreslagna nätverk.

Dessutom, för att förbättra översvämningsdetekteringsmodellens generaliserbarhet över regioner och för att eliminera beroendet av etiketter, presenteras en ny oövervakad CD-modell som upptäcker översvämningar som ändringar på SAR-tidsseriedata. Den föreslagna modellen är tränad för att lära sig rumsliga egenskaper hos SAR-tidsseriedata med hjälp av oövervakade inlärningstekniker, rekonstruktion och kontrastiv inlärning. Förändringskartorna genereras med en ny algoritm som använder de inlärda latenta egenskapersfördelningarna av data före och efter översvämningen. Modellen uppnådde ett genomsnitt på 70% IoU-poäng, vilket överträffade befintliga översvämningsdetekteringsmodeller som RaVAEn (45,03% IoU-poäng), cGAN (51,49% IoU-poäng) och SCCN (54,87% IoU-poäng) med en betydande minimimarginal på 15% IoU-poäng. Den föreslagna modellen är testad för generaliserbarhet och överträffade de övervakade modellerna ADS-Net och DAUSAR när den testades på osynliga CEMS-översvämningsplatser. Dessutom föreslås ett ramverk för automatisk förändringsövervakning och ändringspunktsdetektering. Ramverket är baserat på den föreslagna oövervakade CD-modellen där tidsseriedata bearbetas genom modellen för att identifiera procentuell förändring vid varje tidsstämpel och ändringspunkten detekteras genom att identifiera det datum då betydande förändringar började återspeglas i SAR-data. När det är integrerat med hög tidsdata, dvs. dagliga bilder från ICEYE, kan ramverket hjälpa till med kontinuerlig översvämningsövervakning och tidig upptäckt av långsamt pågående katastrofhändelser, vilket ger mer tid för respons.

Sammantaget bidrar denna avhandling med övervakade och oövervakade översvämningsdetekteringsmodeller, vilket möjliggör omfattande och allmänt användbar översvämningskartläggning och övervakningskapacitet. Dessa framsteg underlättar katastrofhantering i nästan realtid och en motståndskraftig stadsutveckling och bidrar på så sätt till SDG 11 - Hållbara städer och samhällen.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2023. p. 69
Series
TRITA-ABE-DLT ; 2344
Keywords
Floods, Remote Sensing, Sentinel-1 SAR, Segmentation, Change Detection, DEM, Data Fusion, Time Series, Deep Learning, Unsupervised Learning, Contrastive Learning, Self-Attention, Convolutional LSTM, Variational AutoEncoder (VAE)
National Category
Earth Observation Computer Sciences
Research subject
Geodesy and Geoinformatics, Geoinformatics
Identifiers
urn:nbn:se:kth:diva-338909 (URN) 978-91-8040-758-8 (ISBN)
Presentation
2023-11-15, Bora Bora, Teknikringen 10 B, KTH Campus, public video conference link https://kth-se.zoom.us/j/69141787499, Stockholm, 14:00 (English)
Note

QC 20231030

Available from: 2023-10-30 Created: 2023-10-30 Last updated: 2025-02-10 Bibliographically approved
Yadav, R., Nascetti, A. & Ban, Y. (2022). Attentive Dual Stream Siamese U-net for Flood Detection on Multi-temporal Sentinel-1 Data. In: Proceedings IEEE International Geoscience and Remote Sensing Symposium IGARSS 2022. Paper presented at IEEE International Geoscience and Remote Sensing Symposium IGARSS 2022, Kuala Lumpur, Malaysia, 17-22 July 2022 (pp. 5222-5225).
2022 (English) In: Proceedings IEEE International Geoscience and Remote Sensing Symposium IGARSS 2022, 2022, p. 5222-5225. Conference paper, Published paper (Refereed)
Abstract [en]

Due to climate and land-use change, natural disasters such as flooding have been increasing in recent years. Timely and reliable flood detection and mapping can help emergency response and disaster management. In this work, we propose a flood detection network using bi-temporal SAR acquisitions. The proposed segmentation network has an encoder-decoder architecture with two Siamese encoders for pre- and post-flood images. The network’s feature maps are fused and enhanced using attention blocks to achieve more accurate detection of the flooded areas. Our proposed network is evaluated on the publicly available Sen1Floods11 [1] benchmark dataset. The network outperformed the existing state-of-the-art (uni-temporal) flood detection method by 6% IoU. The experiments highlight that the combination of bi-temporal SAR data with an effective network architecture achieves more accurate flood detection than uni-temporal methods.
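
For readers unfamiliar with the IoU metric used to report the gain above, the following is a minimal sketch of how Intersection over Union is commonly computed for binary flood masks. It is not taken from the paper: the function name and the flattened 0/1 list representation are assumptions, and returning 1.0 when both masks are empty is one convention among several.

```python
def iou(pred, truth):
    """Intersection over Union of two equally sized binary masks (flat lists of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels flagged as flood in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Pixels flagged as flood in at least one mask.
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Illustrative masks: one pixel agrees, two are flagged by only one mask.
score = iou([1, 1, 0, 0], [1, 0, 1, 0])
```

Under this definition a gain of 6% IoU means the ratio of correctly detected flood pixels to all pixels flagged by either the prediction or the reference rose by six percentage points.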

Keywords
Flood Detection, bi-temporal, Change Detection, SAR, Siamese, Deep Learning, Encoder-Decoder, Attention
National Category
Computer Sciences Earth Observation
Identifiers
urn:nbn:se:kth:diva-313584 (URN) 10.1109/IGARSS46834.2022.9883132 (DOI) 000920916605075 () 2-s2.0-85140357258 (Scopus ID)
Conference
IEEE International Geoscience and Remote Sensing Symposium IGARSS 2022, Kuala Lumpur, Malaysia, 17-22 July 2022
Note

QC 20220628

Available from: 2022-06-07 Created: 2022-06-07 Last updated: 2025-02-10 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-3599-3164
