Satellite and UAV Imagery for Flood Mapping and Damage Assessment in Mozambique using Machine Learning
Nhangumbe, Manuel
KTH, School of Architecture and the Built Environment (ABE), Urban Planning and Environment, Geoinformatics. ORCID iD: 0000-0003-4448-6180
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Floods are becoming increasingly frequent and impactful worldwide, with their severity intensifying due to climate change. This growing threat has made all countries more vulnerable to natural disasters. Over the past few decades, Mozambique has been particularly affected by several tropical cyclones (TCs). In 2019, following the devastation caused by TCs Idai and Kenneth, Mozambique became the first country in southern Africa to be struck by two cyclones in the same rainy season. In 2023, it was hit twice by the same cyclone, TC Freddy, which was also the longest-lasting tropical cyclone on record.

Given the extent of the damage caused by such events, there is an urgent need for efficient and cost-effective methods to map both flooded and flood-prone areas. These methods are essential for aiding local authorities in disaster preparedness, planning, and impact mitigation. Moreover, they play a vital role in providing information that supports evidence-based decision-making for sustainable development. Several remote sensing (RS) approaches have been proposed for post-flood assessment, including those based on machine learning (ML) and deep learning (DL). While effective, these approaches often require large amounts of annotated data and are typically task-specific, limiting their scalability and adaptability, especially in data-scarce regions.

In this study, we investigate the use of multi-temporal Sentinel-1 (S1) Synthetic Aperture Radar (SAR) and Sentinel-2 (S2) Multi-Spectral Instrument (MSI) data, along with other data sources, to develop scalable, cost-effective, and computationally efficient methods for near real-time flood mapping and flood damage assessment (DA) in Mozambique. Additionally, we explore the use of Geo-Foundation Models (GFMs) on small datasets for flood mapping and DA, including ML-based alternatives to DL approaches.

As such, three approaches for flood mapping are proposed. The first is a fully automated method for near real-time flood mapping, utilizing multi-temporal S1 data acquired over Beira municipality and the Macomia district. It identifies flooded areas by computing the difference between images acquired before and after the flooding event, followed by Otsu’s thresholding method for automatic flood area extraction. The second approach employs both supervised and unsupervised ML methods, such as Support Vector Machines (SVM) and K-Means clustering, leveraging a dataset provided by DrivenData, which was launched as part of a competition for flood mapping using SAR data. This dataset, based on S1, includes VH and VV imagery and labeled data from 13 countries worldwide. By harnessing the processing capability of the Google Earth Engine (GEE) platform, both approaches are presented as an alternative to traditional DL methods due to cost-effectiveness and low computational power requirements. The third approach involves fine-tuning a GFM, named Clay, on the DrivenData dataset for the task of flood mapping. Foundation Models (FMs) refer to models that are pre-trained on broad datasets, typically using large-scale self-supervision, and can be adapted (e.g., fine-tuned) for a wide range of downstream tasks. Clay was initially pre-trained for segmentation, classification, and biomass information extraction using a variety of sensors such as S1, S2, and Landsat. These models are reshaping how traditional ML and DL approaches are trained, significantly reducing the amount of time and data required for training while maintaining high standards of result quality.
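The first approach (image differencing followed by Otsu's thresholding) can be illustrated with a minimal sketch. This is not the thesis implementation, which runs on GEE; it is a pure-Python illustration on flat lists of backscatter values in dB. The names `otsu_threshold` and `flood_mask`, the 256-bin histogram, and the sign convention (after minus before, with flooded pixels assumed to show a strong drop in backscatter) are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def otsu_threshold(values: Sequence[float], bins: int = 256) -> float:
    """Return the threshold that maximizes between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    # Build a histogram of the values.
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i, h in enumerate(hist):
        w0 += h                      # pixels at or below bin i
        if w0 == 0:
            continue
        w1 = total - w0              # pixels above bin i
        if w1 == 0:
            break
        sum0 += i * h
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, i
    # Use the upper edge of the best bin as the threshold value.
    return lo + (best_bin + 1) * width


def flood_mask(before_db: Sequence[float],
               after_db: Sequence[float]) -> Tuple[List[bool], float]:
    """Difference two acquisitions and threshold the difference image.

    Assumption: flooding lowers backscatter, so flooded pixels are those
    whose (after - before) difference falls below the Otsu threshold.
    """
    diff = [a - b for a, b in zip(after_db, before_db)]
    threshold = otsu_threshold(diff)
    return [d < threshold for d in diff], threshold


# Toy example with made-up backscatter values (dB).
before = [-8.0] * 6 + [-9.0] * 4
after = [-8.2, -7.9, -8.1, -8.0, -7.8, -8.3, -19.0, -20.0, -18.5, -19.5]
mask, t = flood_mask(before, after)
```

In this toy input the last four pixels drop by roughly 10 dB and are the ones flagged as flooded. On real imagery the difference image would typically be speckle-filtered before thresholding.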

Furthermore, we explored the use of S2 MSI data to generate a land cover (LC) map of the study area and estimate the percentage of flooded areas within each LC class. The results demonstrate that the combination of S1 and S2 data is a reliable approach for near real-time flood mapping and damage assessment. Using the first approach, we automatically mapped flooded areas with an overall accuracy of about 87–88% and kappa of 0.73–0.75. The second approach also produced satisfactory results, revealing that VH polarization and the combination of VV+VH performed better than using VV polarization alone. Specifically, in Cambodia and Bolivia, VH polarization yielded Intersection over Union (IoU) values ranging from 0.819 to 0.856. Predictions for Beira using VH imagery resulted in an IoU of 0.568, which represents a reasonable outcome. The third approach achieved an IoU exceeding 0.92 and an F1-score above 0.96, outperforming the winning DL solution from the DrivenData competition, which attained an IoU of 0.8072 when the dataset was initially released.
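The accuracy measures quoted above (overall accuracy, kappa, IoU, F1-score) can all be derived from the confusion counts of a predicted flood mask against a reference mask. The following is a minimal sketch using the standard definitions; the function name `binary_map_metrics` and the toy masks are assumptions for illustration and do not reproduce the thesis results.

```python
from typing import Dict, Sequence


def binary_map_metrics(pred: Sequence[bool], truth: Sequence[bool]) -> Dict[str, float]:
    """Compute IoU, F1, overall accuracy and Cohen's kappa for two binary masks."""
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("masks must be non-empty and of equal length")
    n = len(pred)
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = n - tp - fp - fn

    union = tp + fp + fn
    # If neither mask contains flooded pixels, treat the overlap as perfect.
    iou = tp / union if union else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 1.0
    return {"iou": iou, "f1": f1, "accuracy": accuracy, "kappa": kappa}


# Toy masks (made-up values): 1 = flooded, 0 = not flooded.
pred = [bool(x) for x in [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]]
truth = [bool(x) for x in [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]]
metrics = binary_map_metrics(pred, truth)
```

For a binary mask, F1 and IoU are linked by F1 = 2·IoU / (1 + IoU), which is consistent with an IoU just above 0.92 corresponding to an F1-score of about 0.96.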

The LC classification was validated by randomly collecting over 600 points for each LC class, achieving an overall accuracy of 90–95% with a kappa value of 0.80–0.94. These results enabled us to identify areas prone to flooding and regions where floodwaters recede more quickly, providing valuable insights for improved planning. Additionally, we determined the percentage of flooded LC categories such as Agriculture, Mangrove, and Built-up areas, as their destruction has significant implications for food security and socio-economic development.
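Estimating the percentage of each LC class that was flooded amounts to overlaying the flood mask on the land cover map and counting pixels per class. A minimal sketch follows, assuming both rasters are co-registered and flattened to equal-length lists; the function name `flooded_share_by_class` and the pixel values are made up for illustration.

```python
from collections import Counter
from typing import Dict, Sequence


def flooded_share_by_class(land_cover: Sequence[str],
                           flooded: Sequence[bool]) -> Dict[str, float]:
    """Return the percentage of pixels flagged as flooded within each LC class."""
    if len(land_cover) != len(flooded):
        raise ValueError("land cover map and flood mask must have the same size")
    totals = Counter(land_cover)
    hits = Counter(lc for lc, is_flooded in zip(land_cover, flooded) if is_flooded)
    return {lc: 100.0 * hits[lc] / count for lc, count in totals.items()}


# Toy rasters (made-up values), flattened to one dimension.
land_cover = ["Agriculture"] * 5 + ["Built-up"] * 10 + ["Mangrove"] * 4
flooded = ([True, True, False, False, False]      # Agriculture
           + [True] + [False] * 9                 # Built-up
           + [True, True, False, False])          # Mangrove
shares = flooded_share_by_class(land_cover, flooded)
```

With pixels of equal ground area, the pixel share equals the area share; otherwise each pixel would need to be weighted by its area.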

Furthermore, to obtain more detailed insights into the damage in Beira, we deployed Clay for the task of Building Damage Classification (BDC), fine-tuning it on the EDDA dataset. The EDDA dataset, released in 2023, consists of geo-referenced drone imagery captured in Beira after TC Idai. The fine-tuned model achieved a validation IoU of 0.829, which was then compared to the results from a U-Net implementation that yielded a validation IoU of 0.567.

Therefore, the contribution of this thesis lies in providing practical, data-efficient solutions that enhance local disaster management capabilities and community resilience. We have demonstrated that while ML methods are efficient and cost-effective for near real-time flood mapping, particularly when combined with Sentinel data, GFMs offer improved accuracy (even with a small dataset), albeit with slightly higher computational requirements.

Abstract [sv, translated to English]

Floods are becoming more common and their impact more severe worldwide, and their capacity to cause damage is growing because of climate change. This growing threat has made all countries more vulnerable to natural disasters. Over recent decades, Mozambique has been hit particularly hard by several tropical cyclones (TCs). In 2019, after the devastation caused by TCs Idai and Kenneth, Mozambique became the first country in southern Africa to be struck by two cyclones in the same rainy season. In 2023 it was hit twice by the same cyclone, TC Freddy, which was also recorded as the longest-lasting cyclone ever. Given the extent of the damage such events cause, there is an urgent need for efficient and cost-effective methods for mapping both flooded and flood-prone areas. These methods are important for helping local authorities with disaster preparedness, planning, and impact mitigation. They also play an important role in providing information that supports evidence-based decision-making for sustainable development. Several remote sensing (RS) methods have been proposed for post-flood assessment, including those based on machine learning (ML) and deep learning (DL). Although effective, these approaches often require large amounts of annotated data and are usually task-specific, which limits their scalability and adaptability, especially in data-scarce areas. In this study we investigate the use of multi-temporal Sentinel-1 (S1) Synthetic Aperture Radar (SAR) and Sentinel-2 (S2) Multi-Spectral Instrument (MSI) data, together with other data sources, to develop scalable, cost-effective, and computationally efficient methods for flood mapping (FM) and flood damage assessment (DA) in near real time. We also explore the use of Geo-Foundation Models (GFMs) on small datasets for FM and DA, including ML-based alternatives to DL methods. Accordingly, three approaches to FM are proposed.

The first is a fully automated method for near real-time FM that uses multi-temporal S1 data acquired over Beira municipality and the Macomia district. It identifies flooded areas by computing the difference between images acquired before and after the flood event, followed by Otsu's thresholding method for automatic extraction of flooded areas. The second approach uses both supervised and unsupervised ML methods, such as Support Vector Machines (SVM) and K-Means clustering, drawing on a dataset from DrivenData that was launched as part of a competition for FM using SAR data. This dataset, based on S1, includes VH and VV imagery and annotated data from 13 countries worldwide. By harnessing the processing capability of the Google Earth Engine (GEE) platform, both methods are presented as an alternative to traditional DL methods owing to their cost-effectiveness and low demands on computing power. The third approach involves fine-tuning a GFM, called Clay, on the DrivenData dataset for the FM task. Foundation models (FoMs) are models pre-trained on broad datasets, usually with large-scale self-supervision, that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks. Clay was originally intended for segmentation, classification, and biomass information extraction using a variety of sensors such as S1, S2, and Landsat. These models are reshaping how traditional ML and DL methods are trained, considerably reducing the time and data required for training while maintaining high standards of result quality. In addition, we investigated the use of S2 MSI data to generate a land cover (LC) map of the study area and estimate the share of flooded area within each LC class. The results show that combining S1 and S2 data is a reliable approach for near real-time flood mapping and damage assessment.

With the first method we automatically mapped flooded areas with an overall accuracy of about 87–88% and a kappa of 0.73–0.75. The second approach also gave satisfactory results, showing that VH polarization and the VV+VH combination worked better than VV polarization alone. In Cambodia and Bolivia in particular, VH polarization gave Intersection over Union (IoU) values from 0.819 to 0.856. Predictions for Beira with VH imagery resulted in an IoU of 0.568, which is a reasonable result. The third approach achieved an IoU above 0.92 and an F1-score above 0.96, outperforming the winning DL solution from the DrivenData competition, which achieved an IoU of 0.8072 when the dataset was first released. The LC classification was validated by randomly collecting more than 600 points for each LC class, giving an overall accuracy of 90–95% with a kappa value of 0.80–0.94. These results allowed us to identify areas exposed to flooding and regions where the floodwater recedes more quickly, which provides valuable insights for improved planning. We also determined the share of flooded LC categories such as agriculture, mangrove, and built-up areas, since their destruction has significant consequences for food security and socio-economic development. Furthermore, to gain more detailed insight into the damage in Beira, we deployed Clay for the Building Damage Classification (BDC) task, fine-tuning it on the EDDA dataset. The EDDA dataset, released in 2023, consists of geo-referenced drone imagery captured in Beira after tropical cyclone Idai. The fine-tuned model achieved a validation IoU of 0.829, which was then compared with the results of a U-Net implementation that gave a validation IoU of 0.567.

The contribution of this thesis therefore lies in providing practical, data-efficient solutions that improve local disaster management capacity and community resilience. We have shown that although ML methods are efficient and cost-effective for near real-time FM, especially in combination with Sentinel data, GFMs offer improved accuracy (even with a small dataset), albeit with somewhat higher computational requirements.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2025, p. 86
Series
TRITA-ABE-DLT ; 2511
Keywords [en]
Geo-Foundation Models, Machine Learning, Sentinel 1 and 2, Flood Mapping, Classification, Damage Assessment
National Category
Earth and Related Environmental Sciences
Research subject
Geodesy and Geoinformatics, Geoinformatics
Identifiers
URN: urn:nbn:se:kth:diva-363806
ISBN: 978-91-8106-329-5 (print)
OAI: oai:DiVA.org:kth-363806
DiVA, id: diva2:1959958
Public defence
2025-06-12, D3, Lindstedtsvägen 9, KTH Campus, public video conference link https://kth-se.zoom.us/j/67206163625, Stockholm, 09:30 (English)
Opponent
Supervisors
Note

QC 20250523

Available from: 2025-05-23 Created: 2025-05-21 Last updated: 2025-07-08 Bibliographically approved
List of papers
1. Multi-Temporal Sentinel-1 SAR and Sentinel-2 MSI Data for Flood Mapping and Damage Assessment in Mozambique
2023 (English). In: ISPRS International Journal of Geo-Information, ISSN 2220-9964, Vol. 12, no 2, article id 53. Article in journal (Refereed). Published
Abstract [en]

Floods are one of the most frequent natural disasters worldwide. Although vulnerability varies from region to region, all countries are susceptible to flooding. Mozambique was hit by several cyclones in the last few decades, and in 2019, after cyclones Idai and Kenneth, it became the first country in southern Africa to be hit by two cyclones in the same rainy season. To provide local authorities with tools for better responses before and after a disaster event, to mitigate its impact, and to support decision-making for sustainable development, it is fundamental to continue investigating reliable methods for disaster management. In this paper, we propose a fully automated method for flood mapping in near real-time utilizing multi-temporal Sentinel-1 Synthetic Aperture Radar (SAR) data acquired in the Beira municipality and Macomia district. The procedure exploits the processing capability of the Google Earth Engine (GEE) platform. We map flooded areas by computing the difference between images acquired before and after the flooding and then use Otsu's thresholding method to automatically extract the flooded area from the difference image. To validate the proposed technique and compute its accuracy, we compare our results with the Copernicus Emergency Management Service (Copernicus EMS) data available for the study areas. Furthermore, we investigated the use of Sentinel-2 Multi-Spectral Instrument (MSI) data to produce a land cover (LC) map of the study area and estimate the percentage of flooded areas in each LC class. The results show that the combination of Sentinel-1 SAR and Sentinel-2 MSI data is reliable for near real-time flood mapping and damage assessment. We automatically mapped flooded areas with an overall accuracy of about 87-88% and kappa of 0.73-0.75 by directly comparing our prediction with the Copernicus EMS maps. The LC classification is validated by randomly collecting over 600 points for each LC class, and the overall accuracy is 90-95% with a kappa of 0.80-0.94.

Place, publisher, year, edition, pages
MDPI AG, 2023
Keywords
Sentinel-1 and Sentinel-2 imagery, flood mapping, land cover classification, damage assessment
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-325098 (URN); 10.3390/ijgi12020053 (DOI); 000939077000001 (); 2-s2.0-85148769365 (Scopus ID)
Note

QC 20230329

Available from: 2023-03-29 Created: 2023-03-29 Last updated: 2025-05-23 Bibliographically approved
2. Supervised and unsupervised machine learning approaches using Sentinel data for flood mapping and damage assessment in Mozambique
2023 (English). In: Remote Sensing Applications: Society and Environment, E-ISSN 2352-9385, Vol. 32, article id 101015. Article in journal (Refereed). Published
Abstract [en]

Natural hazards such as flooding have been negatively impacting developed and emerging economies alike. The effects of floods are more pronounced in countries of the Global South, where large parts of the population and infrastructure are insufficiently protected from natural hazards. Considerable effort is therefore required to mitigate these impacts by continuously providing new and more reliable tools to aid mitigation and preparedness, during or after a flood event. Flood mapping followed by damage assessment plays an important role in all these stages. In this work we investigate a new dataset provided by DrivenData Labs based on Sentinel-1 (S1) imagery (VH, VV imagery and labels) to help map floods in the city of Beira in Mozambique. Exploiting Google Earth Engine (GEE), we deployed supervised and unsupervised machine learning (ML) methods on a dataset comprising imagery from 13 countries worldwide. We first mapped the floods country by country, including Mozambique. This first part helped us understand the sensitivity of each method when applied to data from different regions and with different polarizations. We then trained the supervised model globally (on all 13 countries) and used it to predict floods in Beira. To assess the accuracy of the experiments we used the intersection over union (IoU) metric and compared the results with the benchmark IoU achieved by the winner of the 2021 DrivenData flood mapping competition. Unsupervised and supervised ML using VH and VV+VH produced satisfactory results and performed better than VV imagery alone; in Cambodia and Bolivia, VH polarization yielded IoU values ranging from 0.819 to 0.856, which is above the benchmark (0.8094). The predictions in Beira using VH imagery yielded an IoU of 0.568, which is a reasonable outcome. The proposed approach is a reliable alternative for flood mapping, especially in Mozambique, because of its low cost and time effectiveness: even unsupervised approaches yield relatively high-quality results in near real-time. Finally, we used Sentinel-2 (S2) imagery for a land cover classification to perform damage assessment in Beira and integrated population data from Beira to enhance the quality of the results. The results show that 20% of the agricultural area and about 10% of the built-up area were flooded. The flooded built-up area includes highly populated neighborhoods such as Chaimite and Ponta Gea that are located in the center of the city.
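The unsupervised route mentioned above can be illustrated with a two-cluster K-Means on a single band. This is a minimal sketch under stated assumptions, not the paper's GEE workflow: it clusters VH backscatter values (dB) in one dimension, initializes the centers at the minimum and maximum, and assumes the lower-backscatter cluster corresponds to water. The function name `kmeans_two_clusters` and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def kmeans_two_clusters(values: Sequence[float],
                        max_iter: int = 50) -> Tuple[List[bool], Tuple[float, float]]:
    """Lloyd's algorithm for k=2 on 1-D data.

    Returns a mask that is True for members of the low-value cluster,
    plus the final (low, high) cluster centers.
    """
    c_low, c_high = min(values), max(values)   # deterministic initialization
    for _ in range(max_iter):
        low = [v for v in values if abs(v - c_low) <= abs(v - c_high)]
        high = [v for v in values if abs(v - c_low) > abs(v - c_high)]
        new_low = sum(low) / len(low) if low else c_low
        new_high = sum(high) / len(high) if high else c_high
        if new_low == c_low and new_high == c_high:
            break                               # assignments have stabilized
        c_low, c_high = new_low, new_high
    in_low_cluster = [abs(v - c_low) <= abs(v - c_high) for v in values]
    return in_low_cluster, (c_low, c_high)


# Toy VH backscatter values in dB (made up); low values assumed to be water.
vh_db = [-24.0, -23.5, -25.0, -15.0, -14.0, -16.0, -15.5]
water_mask, centers = kmeans_two_clusters(vh_db)
```

Because no labels are needed, a clustering step of this kind can be run as soon as a post-event image is available, which is the near real-time advantage the abstract refers to.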

Place, publisher, year, edition, pages
Elsevier BV, 2023
Keywords
Classification, Damage assessment, DrivenData dataset, Flood mapping, Sentinel-1 and Sentinel-2
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-333894 (URN); 10.1016/j.rsase.2023.101015 (DOI); 001054671800001 (); 2-s2.0-85164383013 (Scopus ID)
Note

QC 20230824

Available from: 2023-08-24 Created: 2023-08-24 Last updated: 2025-05-23 Bibliographically approved
3. Geo-foundation models and Sentinel-1 data for flood mapping
(English). Manuscript (preprint) (Other academic)
National Category
Environmental Sciences
Identifiers
urn:nbn:se:kth:diva-363892 (URN)
Note

QC 20250602

Available from: 2025-05-23 Created: 2025-05-23 Last updated: 2025-06-02 Bibliographically approved
4. Geo-foundation models and UAV data for post flooding damage assessment in Mozambique
(English). Manuscript (preprint) (Other academic)
National Category
Environmental Sciences
Identifiers
urn:nbn:se:kth:diva-363893 (URN)
Note

QC 20250602

Available from: 2025-05-23 Created: 2025-05-23 Last updated: 2025-06-02 Bibliographically approved

Open Access in DiVA

Doctoral Thesis (60729 kB), 388 downloads
File information
File name: FULLTEXT01.pdf
File size: 60729 kB
Checksum (SHA-512): 9e29101f7e9e85e9311c4110873a13881c3f037f4447f8230a1fe7b77cc645d3b3e0e0536303072406ebddd074e197ba12f54eb4117c5edfc78e959f646ee273
Type: summary
Mimetype: application/pdf

Search in DiVA

By author/editor
Nhangumbe, Manuel
By organisation
Geoinformatics
Earth and Related Environmental Sciences
