Publications (10 of 20)
Zhao, Y. & Ban, Y. (2025). Assessment of L-Band and C-Band SAR on Burned Area Mapping of Multiseverity Forest Fires Using Deep Learning. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 18, 14148-14159
2025 (English). In: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, ISSN 1939-1404, E-ISSN 2151-1535, Vol. 18, p. 14148-14159. Article in journal (Refereed), Published.
Abstract [en]

Earth observation-based burned area mapping is critical for evaluating the impact of wildfires on ecosystems. Optical satellite data from Landsat and Sentinel-2 are often used to map burned areas. However, they suffer from interference caused by clouds and smoke. Capable of penetrating through clouds and smoke, synthetic aperture radar (SAR) at C- and L-band is also widely used for burned area mapping. With a longer wavelength than C-band SAR, L-band SAR is more sensitive to trunks and branches. Conversely, C-band SAR is sensitive to tree canopy leaves. Thus, the wavelength differences between the two types of sensors result in varying abilities to detect burned areas with different burn severities, as different burn severities cause structural changes in the forests. This research compares ALOS Phased-Array L-band Synthetic Aperture Radar-2 to Sentinel-1 C-band SAR for mapping burned areas across low, medium, and high burn severities. Moreover, a deep-learning-based workflow is utilized to segment burned area maps from both C-band and L-band images. ConvNet-based and transformer-based segmentation models are trained and tested on global wildfires in broadleaf and needle-leaf forests. The results indicate that L-band data show higher backscatter changes compared to C-band data for low and medium severity. In addition, the segmentation models with L-band data as input achieve higher F1 (0.840) and IoU (0.729) scores than models with C-band data (0.757, 0.630). Finally, an ablation study tested different combinations of input bands and the effectiveness of total-variation loss. The study highlights the importance of SAR log-ratio images as input and demonstrates that total-variation loss can reduce the noise in SAR images and improve segmentation accuracy.
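The total-variation loss mentioned in the abstract penalizes differences between neighbouring pixels in the predicted map, which suppresses speckle-induced noise. A minimal numpy sketch of an anisotropic TV term (the paper's exact loss formulation may differ):

```python
import numpy as np

def tv_loss(prob: np.ndarray) -> float:
    """Anisotropic total variation of a 2-D probability map.

    Sums absolute differences between vertically and horizontally
    adjacent pixels; minimising this term smooths noisy predictions.
    """
    dh = np.abs(np.diff(prob, axis=0)).sum()  # vertical neighbours
    dw = np.abs(np.diff(prob, axis=1)).sum()  # horizontal neighbours
    return float(dh + dw)

# A noisy checkerboard-like map has a much higher TV than a constant one.
smooth = np.full((4, 4), 0.5)
noisy = np.zeros((4, 4))
noisy[::2, ::2] = 1.0
print(tv_loss(smooth), tv_loss(noisy))  # 0.0 12.0
```

In training, a weighted TV term would be added to the segmentation loss (e.g. cross-entropy), trading boundary sharpness for noise suppression.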

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
L-band, C-band, Synthetic aperture radar, Image segmentation, Deep learning, Wildfires, Transformers, Sentinel-1, Forestry, Backscatter, Burned area mapping, Phased Array L-band Synthetic Aperture Radar-2 (PALSAR), remote sensing, wildfire monitoring
National Category
Earth Observation; Physical Geography
Identifiers
urn:nbn:se:kth:diva-368400 (URN), 10.1109/JSTARS.2025.3560287 (DOI), 001508110200001 (ISI), 2-s2.0-105002474080 (Scopus ID)
Note

QC 20250818

Available from: 2025-08-18. Created: 2025-08-18. Last updated: 2025-08-18. Bibliographically approved.
Xu, Z., Li, J., Cheng, S., Rui, X., Zhao, Y., He, H., . . . Xu, L. L. (2025). Deep learning for wildfire risk prediction: Integrating remote sensing and environmental data. ISPRS journal of photogrammetry and remote sensing (Print), 227, 632-677
2025 (English). In: ISPRS journal of photogrammetry and remote sensing (Print), ISSN 0924-2716, E-ISSN 1872-8235, Vol. 227, p. 632-677. Article, review/survey (Refereed), Published.
Abstract [en]

Wildfires pose a significant threat to ecosystems, wildlife, and human communities, leading to habitat destruction, pollutant emissions, and biodiversity loss. Accurate wildfire risk prediction is crucial for mitigating these impacts and safeguarding both environmental and human health. This paper provides a comprehensive review of wildfire risk prediction methodologies, with a particular focus on deep learning approaches combined with remote sensing. We begin by defining wildfire risk and summarizing the geographical distribution of related studies. In terms of data, we analyze key predictive features, including fuel characteristics, meteorological and climatic conditions, socioeconomic factors, topography, and hydrology, while also reviewing publicly available wildfire prediction datasets derived from remote sensing. Additionally, we emphasize the importance of feature collinearity assessment and model interpretability to improve the understanding of prediction outcomes. Regarding methodology, we classify deep learning models into three primary categories: time-series forecasting, image segmentation, and spatiotemporal prediction, and further discuss methods for converting model outputs into risk classifications or probability-adjusted predictions. Finally, we identify the key challenges and limitations of current wildfire-risk prediction models and outline several research opportunities. These include integrating diverse remote sensing data, developing multimodal models, designing more computationally efficient architectures, and incorporating cross-disciplinary methods—such as coupling with numerical weather-prediction models—to enhance the accuracy and robustness of wildfire-risk assessments.
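The feature-collinearity assessment the review emphasizes can start with a simple screen of pairwise Pearson correlations before model training. A hedged numpy sketch (the threshold and feature names are illustrative, not from the paper):

```python
import numpy as np

def collinear_pairs(X: np.ndarray, names: list, thresh: float = 0.9):
    """Flag feature pairs whose absolute Pearson correlation exceeds `thresh`.

    X has shape (samples, features). Highly correlated predictors
    (e.g. two near-identical temperature products) can then be dropped
    or combined before training a risk-prediction model.
    """
    r = np.corrcoef(X, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > thresh:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs

rng = np.random.default_rng(0)
temp = rng.normal(size=200)
# "lst" is nearly a linear function of "t2m"; "wind" is independent.
X = np.column_stack([temp,
                     1.8 * temp + 0.01 * rng.normal(size=200),
                     rng.normal(size=200)])
pairs = collinear_pairs(X, ["t2m", "lst", "wind"])
print(pairs)
```

Variance inflation factors are a common next step when more than two features are entangled.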

Place, publisher, year, edition, pages
Elsevier BV, 2025
Keywords
Deep learning, Remote sensing, Review, Risk prediction, Wildfire
National Category
Earth Observation; Artificial Intelligence
Identifiers
urn:nbn:se:kth:diva-368895 (URN), 10.1016/j.isprsjprs.2025.06.002 (DOI), 001528940300001 (ISI), 2-s2.0-105009688076 (Scopus ID)
Note

QC 20250822

Available from: 2025-08-22. Created: 2025-08-22. Last updated: 2025-10-24. Bibliographically approved.
Zhao, Y. & Ban, Y. (2025). Near real-time wildfire progression mapping with VIIRS time-series and autoregressive SwinUNETR. International Journal of Applied Earth Observation and Geoinformation, 136, Article ID 104358.
2025 (English). In: International Journal of Applied Earth Observation and Geoinformation, ISSN 1569-8432, E-ISSN 1872-826X, Vol. 136, article id 104358. Article in journal (Refereed), Published.
Abstract [en]

Wildfire management and response require frequent and accurate burned area mapping. Mapping daily burned areas with satisfactory accuracy remains challenging due to missed detections caused by accumulating active fire points, the low temporal resolution of sensors onboard satellites like Sentinel-2 and Landsat-8/9, and the monthly cadence of the burned area product generated from Visible Infrared Imaging Radiometer Suite (VIIRS) data. ConvNet-based and Transformer-based deep-learning models are widely applied to mid-spatial-resolution satellite images, but these models perform poorly on low-spatial-resolution images. Cloud interference is another major issue when continuously monitoring burned areas. To improve detection accuracy and reduce cloud interference by combining temporal and spatial information, we propose an autoregressive spatial-temporal model, AR-SwinUNETR, to segment daily burned areas from VIIRS time-series. AR-SwinUNETR processes the image time-series as a 3D tensor but respects the temporal ordering between images in the time-series by applying an autoregressive mask in the Swin-Transformer block. The model is trained on 2017-2020 wildfire events in the US and validated on 2021 US wildfire events. The quantitative results indicate that AR-SwinUNETR achieves a higher F1-Score than baseline deep learning models. On the test set, which consists of eight long-duration 2023 wildfires in Canada, it achieves a better F1 Score (0.757 vs. 0.715) and IoU Score (0.607 vs. 0.557) than the baseline of accumulated VIIRS active fire hotspots, evaluated against labels generated from Sentinel-2 images. In conclusion, the proposed AR-SwinUNETR with VIIRS image time-series can efficiently detect daily burned areas with better accuracy than direct burned area mapping from VIIRS active fire hotspots, while keeping a high (daily) temporal resolution compared to other burned area mapping products. The qualitative results also show improvements in detecting burned areas in cloudy images.
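The autoregressive mask described here restricts each time step to attend only to earlier acquisitions, so the map for day t never depends on later frames. A minimal numpy sketch of causally masked attention over a time axis (the actual Swin-Transformer block adds windowed spatial attention and learned projections):

```python
import numpy as np

def autoregressive_attention(q, k, v):
    """Scaled dot-product attention over T frames with a causal mask.

    q, k, v: arrays of shape (T, d). Frame t may only attend to
    frames 0..t; entries above the diagonal of the score matrix
    are masked out before the softmax.
    """
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)                    # (T, T) similarities
    mask = np.triu(np.ones((T, T), dtype=bool), 1)   # True above diagonal
    scores[mask] = -np.inf                           # block future frames
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                # row-wise softmax
    return w @ v, w

rng = np.random.default_rng(1)
q = rng.normal(size=(5, 8))
k = rng.normal(size=(5, 8))
v = rng.normal(size=(5, 8))
out, w = autoregressive_attention(q, k, v)
print(np.triu(w, 1).max())  # weights on future frames are exactly zero
```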

Place, publisher, year, edition, pages
Elsevier BV, 2025
Keywords
Burned area mapping, Disaster response, Image segmentation, Remote sensing, Swin-Transformer, VIIRS, Wildfire monitoring
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-358890 (URN), 10.1016/j.jag.2025.104358 (DOI), 001416930000001 (ISI), 2-s2.0-85214833274 (Scopus ID)
Note

Not duplicate with DiVA 1913766

QC 20250123

Available from: 2025-01-23. Created: 2025-01-23. Last updated: 2025-02-26. Bibliographically approved.
Zhao, Y. & Ban, Y. (2025). RADARSAT constellation mission compact polarisation SAR data for burned area mapping with deep learning. International Journal of Applied Earth Observation and Geoinformation, 141, Article ID 104615.
2025 (English). In: International Journal of Applied Earth Observation and Geoinformation, ISSN 1569-8432, E-ISSN 1872-826X, Vol. 141, article id 104615. Article in journal (Refereed), Published.
Abstract [en]

Monitoring wildfires has become increasingly critical due to the sharp rise in wildfire incidents in recent years. Optical satellites like Sentinel-2 and Landsat are extensively utilised for mapping burned areas. However, the effectiveness of optical sensors is compromised by clouds and smoke, which obstruct the detection of burned areas. Thus, satellites equipped with Synthetic Aperture Radar (SAR), such as dual-polarisation Sentinel-1 and quad-polarisation RADARSAT-1/-2 C-band SAR, which can penetrate clouds and smoke, are investigated for mapping burned areas. However, there is limited research on using compact polarisation (compact-pol) C-band RADARSAT Constellation Mission (RCM) SAR data for this purpose. This study aims to investigate the capacity of compact-pol RCM data for burned area mapping through deep learning. The compact-pol m-χ decomposition and the Compact-pol Radar Vegetation Index (CpRVI) are derived from the RCM Multi-Look Complex product. A deep-learning-based processing pipeline incorporating ConvNet-based and Transformer-based models is applied for burned area mapping, with three different input settings: using only log-ratio dual-polarisation intensity images, using only the compact-pol decomposition plus CpRVI, and using all three data sources. The training dataset comprises 46,295 patches, generated from 12 major wildfire events in Canada. The test dataset includes seven wildfire events from the 2023 and 2024 Canadian wildfire seasons in Alberta, British Columbia, Quebec and the Northwest Territories. The results demonstrate that compact-pol m-χ decomposition and CpRVI images significantly complement log-ratio images for burned area mapping. The best-performing Transformer-based model, UNETR, trained with log-ratio, m-χ decomposition, and CpRVI data, achieved an F1 Score of 0.718 and an IoU Score of 0.565, a notable improvement over the same model trained using only log-ratio images (F1 Score: 0.684, IoU Score: 0.557). This is the first study to demonstrate that RCM C-band SAR data and its derived features are effective for burned area mapping.
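The m-χ decomposition derives even-bounce, volume, and odd-bounce channels from the compact-pol Stokes vector. A textbook-style numpy sketch under the usual circular-transmit, linear-receive assumption (sign conventions vary between processors, and in practice the Stokes parameters are multi-looked first, since single-look pixels are fully polarized with m = 1 and zero volume power):

```python
import numpy as np

def m_chi_decomposition(rh: np.ndarray, rv: np.ndarray):
    """m-chi decomposition of compact-pol SAR (RH/RV complex channels).

    Builds the per-pixel Stokes vector, the degree of polarization m
    and the ellipticity term sin(2*chi), then the even-bounce, volume
    and odd-bounce amplitude channels.
    """
    s0 = np.abs(rh) ** 2 + np.abs(rv) ** 2
    s1 = np.abs(rh) ** 2 - np.abs(rv) ** 2
    s2 = 2.0 * np.real(rh * np.conj(rv))
    s3 = -2.0 * np.imag(rh * np.conj(rv))
    m = np.sqrt(s1 ** 2 + s2 ** 2 + s3 ** 2) / np.maximum(s0, 1e-12)
    sin2chi = -s3 / np.maximum(m * s0, 1e-12)
    even = np.sqrt(np.clip(m * s0 * (1.0 + sin2chi) / 2.0, 0.0, None))
    vol = np.sqrt(np.clip(s0 * (1.0 - m), 0.0, None))
    odd = np.sqrt(np.clip(m * s0 * (1.0 - sin2chi) / 2.0, 0.0, None))
    return even, vol, odd

rng = np.random.default_rng(2)
rh = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rv = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
even, vol, odd = m_chi_decomposition(rh, rv)
# The three power channels partition the total power s0.
print(np.abs(even**2 + vol**2 + odd**2 - (np.abs(rh)**2 + np.abs(rv)**2)).max())
```

The CpRVI is a separate index computed from the same Stokes parameters; its exact formula is omitted here rather than guessed.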

Place, publisher, year, edition, pages
Elsevier BV, 2025
Keywords
Burned area mapping, Compact polarisation, Decomposition, Deep learning, Radar vegetation index, RADARSAT constellation mission, SAR
National Category
Earth Observation; Signal Processing
Identifiers
urn:nbn:se:kth:diva-366003 (URN), 10.1016/j.jag.2025.104615 (DOI), 001515278300001 (ISI), 2-s2.0-105007558441 (Scopus ID)
Note

Not duplicate with DiVA 1913771

QC 20250704

Available from: 2025-07-04. Created: 2025-07-04. Last updated: 2025-09-22. Bibliographically approved.
Zhao, Y., Gerard, S. & Ban, Y. (2025). TS-SatFire: A Multi-Task Satellite Image Time-Series Dataset for Wildfire Detection and Prediction. Scientific Data, 12(1), Article ID 1817.
2025 (English). In: Scientific Data, E-ISSN 2052-4463, Vol. 12, no 1, article id 1817. Article in journal (Refereed), Published.
Abstract [en]

Wildfire monitoring and prediction are essential for understanding wildfire behaviour. With extensive Earth observation data, these tasks can be integrated and enhanced through multi-task deep learning models. We present a comprehensive multi-temporal remote sensing dataset for active fire detection, daily wildfire monitoring, and next-day wildfire prediction. Covering wildfire events in the contiguous U.S. from January 2017 to October 2021, the dataset includes 3552 surface reflectance images and auxiliary data such as weather, topography, land cover, and fuel information, totalling 71 GB. Each wildfire’s lifecycle is documented, with labels for active fires (AF) and burned areas (BA), supported by manual quality assurance of AF and BA test labels. The dataset supports three tasks: a) active fire detection, b) daily burned area mapping, and c) wildfire progression prediction. Detection tasks use pixel-wise classification of multi-spectral, multi-temporal images, while prediction tasks integrate satellite and auxiliary data to model fire dynamics. This dataset and its benchmarks provide a foundation for advancing wildfire research using deep learning.

Place, publisher, year, edition, pages
Springer Nature, 2025
National Category
Earth Observation; Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-374099 (URN), 10.1038/s41597-025-06271-3 (DOI), 001618995000012 (ISI), 41258139 (PubMedID), 2-s2.0-105022315025 (Scopus ID)
Note

QC 20251216

Available from: 2025-12-16. Created: 2025-12-16. Last updated: 2025-12-16. Bibliographically approved.
Zhao, Y. & Ban, Y. (2024). Burned area mapping with RADARSAT Constellation Mission data and deep learning. In: IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024. Paper presented at IEEE International Geoscience and Remote Sensing Symposium (IGARSS), JUL 07-12, 2024, Athens, GREECE (pp. 4553-4556). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 4553-4556. Conference paper, Published paper (Refereed).
Abstract [en]

Monitoring wildfires has become increasingly critical due to the sharp rise in wildfire incidents in recent years. Optical satellites like Sentinel-2 and Landsat are extensively utilized for mapping burned areas. However, the effectiveness of optical sensors is compromised by clouds and smoke, which obstruct the detection of burned areas. As a result, there is growing interest in satellites equipped with Synthetic Aperture Radar (SAR), which can penetrate clouds and smoke. Previous studies have investigated the potential of Sentinel-1 and RADARSAT-1/-2 C-band SAR for burned area mapping. However, to the best of our knowledge, no published research has used RADARSAT Constellation Mission (RCM) SAR data for this purpose. The objective of this study is to investigate RCM SAR data for burned area mapping using deep learning. We propose a deep-learning-based processing pipeline specifically for RCM data, with U-Net as the segmentation model. The training samples are preprocessed to generate log-ratio images based on the same beam mode. The training labels are generated from binarized log-ratio images and Sentinel-2 polygons. Our results demonstrate that RCM data can effectively detect burned areas in the 2023 Canadian wildfires, achieving an F1 Score of 0.765 and an IoU Score of 0.620 for the study area in Alberta, and an F1 Score of 0.655 and an IoU Score of 0.487 for the study area in Quebec. These results indicate the promising potential of RCM data in wildfire monitoring.
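The log-ratio preprocessing and threshold-based label generation described above can be sketched as follows. The -3 dB threshold and the assumption that burning lowers backscatter are illustrative only; the paper additionally refines labels with Sentinel-2 polygons:

```python
import numpy as np

def log_ratio(pre: np.ndarray, post: np.ndarray, eps: float = 1e-10) -> np.ndarray:
    """Log-ratio change image (dB) from pre- and post-fire backscatter intensity."""
    return 10.0 * np.log10((post + eps) / (pre + eps))

def binarize(lr_db: np.ndarray, thresh_db: float) -> np.ndarray:
    """Coarse burned/unburned mask: a strong backscatter drop counts as burned."""
    return (lr_db <= thresh_db).astype(np.uint8)

# Toy 2x2 scene: left column loses most of its backscatter after the fire.
pre = np.array([[0.10, 0.10],
                [0.10, 0.10]])
post = np.array([[0.02, 0.10],
                 [0.03, 0.09]])
lr = log_ratio(pre, post)
print(binarize(lr, thresh_db=-3.0))
```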

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
IEEE International Symposium on Geoscience and Remote Sensing IGARSS, ISSN 2153-6996
Keywords
RADARSAT Constellation Mission, Burned Area Mapping, C-Band, SAR, Deep Learning
National Category
Earth Observation
Identifiers
urn:nbn:se:kth:diva-360951 (URN), 10.1109/IGARSS53475.2024.10640398 (DOI), 001316158504204 (ISI), 2-s2.0-85204869960 (Scopus ID)
Conference
IEEE International Geoscience and Remote Sensing Symposium (IGARSS), JUL 07-12, 2024, Athens, GREECE
Note

Part of ISBN 979-8-3503-6033-2, 979-8-3503-6032-5

QC 20250310

Available from: 2025-03-10. Created: 2025-03-10. Last updated: 2025-03-10. Bibliographically approved.
Zhao, Y. (2024). Deep Learning for Wildfire Detection Using Multi-Sensor Multi-Resolution Satellite Images. (Doctoral dissertation). KTH Royal Institute of Technology
2024 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

In recent years, climate change and human activities have caused increasing numbers of wildfires. Earth observation data with various spatial and temporal resolutions have shown great potential in detecting and monitoring wildfires. Sensors with different spatial and temporal resolutions detect wildfires at different stages. Satellites with low spatial resolution and high temporal resolution are mostly used for active fire detection and early-stage burned area mapping because of their frequent revisits. While these products are very useful, the existing solutions have flaws, including many false alarms due to cloud cover or building roofs at high temperatures. Also, the multi-criteria threshold-based method does not leverage the rich temporal information of each pixel at different timestamps or the rich spatial information between neighbouring pixels. Therefore, advanced processing algorithms are needed to detect active fires. Satellites with medium spatial resolution and low temporal resolution are often used to detect post-fire burned areas. Optical sensors like Sentinel-2 and Landsat-8/9 are commonly used, but their low temporal resolution makes it difficult to monitor ongoing wildfires, as the images are likely to be affected by clouds and smoke. Synthetic Aperture Radar (SAR) satellites like Sentinel-1, ALOS-2 and the RADARSAT Constellation Mission (RCM) can penetrate clouds, and their spatial resolutions are around 30 meters. However, few studies have compared the effectiveness of C-band and L-band data or investigated the use of compact polarization for burned area mapping.

The main objective of this thesis is to develop deep learning methods for improved active fire detection, daily burned area mapping and post-fire burned area mapping utilizing multi-sensor multi-resolution earth observation images. 

 Temporal models such as Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), and Transformer networks are promising for effectively capturing temporal information embedded in the image time-series produced by high temporal resolution sensors. Spatial models, including ConvNet-based and Transformer-based architectures, are well-suited for leveraging the rich spatial details in images from mid-resolution sensors. Furthermore, when dealing with image time-series that contain both abundant temporal and spatial information, spatial-temporal models like 3D ConvNet-based and Transformer-based models are ideal for addressing the task. 

In this thesis, the GRU-based GOES-R early detection method consists of a 5-layer GRU network that utilizes GOES-R ABI pixel time-series and classifies the active fire pixels at each time step. For 36 study areas, the proposed method detects 26 wildfires earlier than the VIIRS active fire product. Moreover, the method mitigates the coarse resolution of GOES-R ABI images by upsampling, and the results show more reliable early-stage active fire locations and less noise compared to the GOES-R active fire product.

Furthermore, the VIIRS time-series images are investigated for both active fire detection and daily burned area mapping. For active fire detection, the image time-series are tokenized into vectors of pixel time-series as the input to the proposed Transformer model. For daily burned area mapping, the 3-dimensional Swin-Transformer model is directly applied to the image time-series. The attention mechanism of the Transformer helps to find the spatial-temporal relations of the pixels. By detecting the variation of the pixel values, the proposed model classifies the pixel at different time steps as an active fire pixel or a non-fire pixel. The proposed method is tested over 18 study areas across different regions and provides a 0.804 F1-Score. It outperforms the VIIRS active fire product from NASA, which has a 0.663 F1-Score. For daily burned area mapping, it also outperforms the accumulation of VIIRS active fire hotspots in the F1 Score (0.811 vs 0.730). The Transformer model is also proven superior for active fire detection to other sequential models such as GRU and to spatial models like U-Net. Additionally, for burned area detection, the proposed AR-SwinUNETR shows superior performance over spatial models and other baseline spatial-temporal models.
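The F1 and IoU scores quoted throughout the thesis are the standard pixel-wise metrics for binary masks; for the same confusion counts they are related by IoU = F1 / (2 - F1). A minimal numpy sketch:

```python
import numpy as np

def f1_iou(pred: np.ndarray, label: np.ndarray):
    """Pixel-wise F1 and IoU for binary burned-area / active-fire masks."""
    pred, label = pred.astype(bool), label.astype(bool)
    tp = np.sum(pred & label)   # correctly detected fire pixels
    fp = np.sum(pred & ~label)  # false alarms
    fn = np.sum(~pred & label)  # missed fire pixels
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 1.0
    return float(f1), float(iou)

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
label = np.array([[1, 0, 0],
                  [0, 1, 1]])
f1, iou = f1_iou(pred, label)
print(f1, iou)  # 2 TP, 1 FP, 1 FN -> F1 = 2/3, IoU = 0.5
```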

To address the limitation of optical images due to cloud cover, C-band data from Sentinel-1 and RCM, as well as L-band data from ALOS-2 PALSAR-2, are evaluated for post-fire burned area detection. To assess the effectiveness of SAR at different wavelengths, the performance of the same deep learning model is cross-compared on burned areas of varying severities in broadleaf and needleleaf forests using both Sentinel-1 SAR and PALSAR-2 SAR data. The results indicate that L-band SAR is more sensitive for detecting low and medium burn severities. Overall, models using L-band data achieve superior performance, with an F1 Score of 0.840 and an IoU Score of 0.729, compared to models using C-band data, which scored 0.757 and 0.630, respectively, across 12 test wildfires. For the RCM data, which provides compact polarization (compact-pol) at C-band, the inclusion of features generated from the m-χ compact polarization decomposition and the radar vegetation index, combined with the original images, further enhances performance. The results demonstrate that leveraging polarization decomposition and the radar vegetation index improves detection accuracy for baseline deep learning models compared to using compact-pol images alone.

In conclusion, this thesis demonstrates the potential of advanced deep learning methods and multi-sensor Earth observation data for improving wildfire detection and burned area mapping, achieving superior performance across various sensors and methodologies.

Abstract [sv]

De senaste åren har klimatförändringar och mänskliga aktiviteter orsakat ett ökande antal skogsbränder. Jordobservationsdata med olika rumsliga och tidsmässiga upplösningar har visat stor potential för att upptäcka och övervaka skogsbränder. Sensorer med olika rumsliga och tidsmässiga upplösningar upptäcker skogsbränder i olika steg. För satelliter med låg rumslig upplösning och hög tidsupplösning används de mest i aktiv branddetektering och kartläggning av brända områden i ett tidigt skede på grund av deras frekventa återbesök. Även om dessa produkter är mycket användbara har de befintliga lösningarna brister, inklusive många falska larm på grund av molntäcke eller byggnader med tak i höga temperaturer. Den tröskelbaserade metoden med flera kriterier utnyttjar inte heller rik tidsinformation för varje pixel vid olika tidsstämplar och rik rumslig information mellan angränsande pixlar. Därför behövs avancerade bearbetningsalgoritmer för att upptäcka aktiva bränder. För satelliter med medium rumslig upplösning och låg tidsupplösning används de ofta för att upptäcka brända områden efter brand. Optiska sensorer som Sentinel-2 och Landsat-8/9 används ofta men deras låga tidsupplösning gör dem svåra att övervaka pågående löpeld eftersom de sannolikt kommer att påverkas av moln och rök. Synthetic Aperture Radar (SAR) satelliter som Sentinel-1, ALOS-2 och RADARSAT Constellation Mission (RCM) kan penetrera genom molnet och deras rumsliga upplösningar är cirka 30 meter. Emellertid har begränsade studier jämfört effektiviteten av C-bands- och L-bandsdata och undersökt användningen av kompakt polarisering på kartläggning av brända områden.

Huvudsyftet med detta examensarbete är att utveckla metoder för djupinlärning för förbättrad aktiv branddetektering, daglig kartläggning av brända områden och kartläggning av brända områden efter brand med hjälp av multi-sensor flerupplösta jordobservationsbilder. Temporala modeller såsom Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM) och Transformer-nätverk lovar att effektivt fånga tidsinformation inbäddad i bildtidsserierna som produceras av sensorer med hög tidsupplösning. Rumsliga modeller, inklusive ConvNet-baserade och Transformer-baserade arkitekturer, är väl lämpade för att utnyttja de rika rumsliga detaljerna i bilder från medelupplösningssensorer. Dessutom, när det handlar om bildtidsserier som innehåller både riklig tids- och rumsinformation, är rumsliga-temporala modeller som 3D ConvNet-baserade och Transformer-baserade modeller idealiska för att ta itu med uppgiften.

I detta examensarbete består den GRU-baserade GOES-R tidig detekteringsmetoden av ett 5-lagers GRU-nätverk som använder GOES-R ABI-pixeltidsserier och klassificerar de aktiva brandpixlarna vid varje tidssteg. För 36 studieområden upptäcker den föreslagna metoden 26 skogsbränder tidigare än VIIRS aktiva brandprodukt. Dessutom mildrar metoden problemet med grov upplösning av GOES-R ABI-bilder genom uppsampling och resultaten visar mer tillförlitlig lokalisering av aktiv brand i tidigt skede och dämpar bruset jämfört med GOES-R aktiv brandprodukt.

Vidare undersöks VIIRS tidsseriebilder för både aktiv branddetektering och daglig kartläggning av brända områden. För aktiv branddetektering tokeniseras bildtidsserierna till vektorer av pixeltidsserier som indata till den föreslagna transformatormodellen. För daglig kartläggning av brända områden appliceras den 3-dimensionella Swin-Transformer-modellen direkt på bildtidsserien. Transformatorns uppmärksamhetsmekanism hjälper till att hitta pixelns rumsliga-temporala relationer. Genom att detektera variationen av pixelvärdena klassificerar den föreslagna modellen pixeln vid olika tidssteg som en aktiv brandpixel eller en icke-brandpixel. Den föreslagna metoden testas över 18 studieområden i olika regioner och ger en 0,804 F1-Score. Den överträffar VIIRS aktiva brandprodukter från NASA som har 0,663 F1-poäng. För daglig kartläggning av brända områden överträffar den också ackumuleringen av VIIRS aktiva brandhärdar i F1-poängen (0,811 mot 0,730). Transformer-modellen har också visat sig vara överlägsen för aktiv branddetektering jämfört med andra sekventiella GRU-modeller och rumsliga modeller som U-Net. Dessutom, för detektering av bränt område, visar den föreslagna AR-SwinUNETR också överlägsen prestanda jämfört med rumsliga modeller och andra baslinje-rums-temporala modeller.

För att komma till rätta med begränsningen av optiska bilder på grund av molntäcke utvärderas C-bandsdata från Sentinel-1 och RCM, samt L-bandsdata från ALOS-2 PALSAR-2, för detektering av bränt område efter brand. För att bedöma effektiviteten av SAR vid olika våglängder korsjämförs prestandan för samma djupinlärningsmodell på brända områden av varierande svårighetsgrad i löv- och barrskogar med hjälp av både Sentinel-1 SAR- och PALSAR-2 SAR-data. Resultaten indikerar att L-band SAR är känsligare för att detektera låga och medelhöga brännskador. Sammantaget uppnår modeller som använder L-bandsdata överlägsen prestanda, med ett F1-poäng på 0,840 och ett IoU-poäng på 0,729, jämfört med modeller som använder C-bandsdata, som fick 0,757 respektive 0,630 i 12 testskogsbränder. För RCM-data, som ger kompakt polarisering (compact-pol) vid C-bandet, förbättrar inkluderingen av funktioner genererade från m-χ kompakt polarisationsuppdelning och radarvegetationsindex, i kombination med originalbilderna, prestandan ytterligare. Resultaten visar att utnyttjande av polarisationsuppdelning och radarvegetationsindex förbättrar detekteringsnoggrannheten för baslinjemodeller för djupinlärning jämfört med att använda enbart kompakta polbilder.

Sammanfattningsvis visar denna avhandling potentialen hos avancerade metoder för djupinlärning och jordobservationsdata med flera sensorer för att förbättra detektering av skogsbränder och kartläggning av brända områden, för att uppnå överlägsen prestanda över olika sensorer och metoder.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2024. p. 121
Series
TRITA-ABE-DLT ; 2430
Keywords
Wildfire, Remote Sensing, Active Fire Detection, Burned Area Mapping, GOES-R ABI, Suomi-NPP VIIRS, Sentinel-1, PALSAR-2, RADARSAT Constellation Mission, Image Segmentation, Deep Learning, Gated Recurrent Units (GRU), Transformer, Convolutional Neural Network; Vilda Bränder, Fjärranalys, Aktiv Branddetektering, Kartläggning av Bränt Område, GOES-R ABI, Suomi-NPP VIIRS, Sentinel-1, PALSAR-2, Bildsegmentering, Djupinlärning, Gated Recurrent Units (GRU), Transformer, Convolutional Neural Network
National Category
Engineering and Technology
Research subject
Geodesy and Geoinformatics, Geoinformatics
Identifiers
urn:nbn:se:kth:diva-356334 (URN), 978-91-8106-113-0 (ISBN)
Public defence
2024-12-06, https://kth-se.zoom.us/j/62299317578, Kollegiesalen, Brinellvägen 26, KTH Campus, Stockholm, 09:00 (English)
Funder
Swedish Research Council Formas, H72100
Note

QC 20241118

Available from: 2024-11-18. Created: 2024-11-15. Last updated: 2024-12-04. Bibliographically approved.
Zhao, Y. (2023). Deep Learning for Active Fire Detection Using Multi-Source Satellite Image Time Series. (Licentiate dissertation). Stockholm: KTH Royal Institute of Technology
2023 (English). Licentiate thesis, comprehensive summary (Other academic).
Abstract [en]

In recent years, climate change and human activities have caused increasing numbers of wildfires. Earth observation data with various spatial and temporal resolutions have shown great potential in detecting and monitoring wildfires. The Advanced Baseline Imager (ABI) onboard NOAA's geostationary weather satellites, the Geostationary Operational Environmental Satellites R Series (GOES-R), can acquire images every 15 minutes at 2 km spatial resolution and has been used for early fire detection. The Moderate Resolution Imaging Spectroradiometer (MODIS) and the Visible Infrared Imaging Radiometer Suite (VIIRS), onboard sun-synchronous satellites, offer twice-daily revisits and are widely used in active fire detection. The VIIRS Active Fire product (VNP14IMG) has 375 m spatial resolution and the MODIS Active Fire product (MCD14DL) has 1 km spatial resolution. While these products are very useful, the existing solutions have flaws, including many false alarms due to cloud cover or building roofs at high temperatures. Also, the multi-criteria threshold-based method does not leverage the rich temporal information of each pixel at different timestamps or the rich spatial information between neighbouring pixels. Therefore, advanced processing algorithms are needed to provide reliable detection of active fires.

In this thesis, the main objective is to develop deep learning-based methods for improved active fire detection, utilizing multi-sensor earth observation images. The high temporal resolution of the above satellites makes temporal information more valuable than spatial resolution. Therefore, sequential deep learning models like Gated Recurrent Unit (GRU), Long-Short Term Memory (LSTM), and Transformer are promising candidates for utilizing the temporal information encoded in the variation of the thermal band values. In this thesis, a GRU-based early fire detection method is proposed using GOES-R ABI time series, which detects wildfires earlier than NASA's VIIRS active fire product. In addition, a Transformer-based method is proposed utilizing the Suomi National Polar-orbiting Partnership (Suomi-NPP) VIIRS time series, which shows better accuracy in active fire detection than the VIIRS active fire product.

The GRU-based GOES-R early detection method utilizes GOES-R ABI time series composed of the normalized difference between Mid-Infrared (MIR) Band 7 and Long-wave Infrared Band 14, with Long-wave Infrared Band 15 used as the cloud mask. A 5-layer GRU network is proposed to process the time series of each pixel and classify the active fire pixels at each time step. For 36 study areas across North America and South America, the proposed method detects 26 wildfires earlier than the VIIRS active fire product. Moreover, the method mitigates the problem of the coarse resolution of GOES-R ABI images by upsampling, and the results show more reliable early-stage active fire locations and suppress the noise compared to the GOES-R active fire product.
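The normalized-difference index described above can be sketched as follows. This is an illustrative sketch, not the thesis code: the function name, the epsilon guard, and the sample brightness temperatures are assumptions for demonstration.

```python
import numpy as np

def abi_normalized_difference(band7, band14, eps=1e-6):
    """Per-pixel normalized difference between GOES-R ABI Band 7 (MIR)
    and Band 14 (LWIR) brightness temperatures.
    Illustrative sketch; the exact preprocessing in the thesis may differ."""
    band7 = np.asarray(band7, dtype=float)
    band14 = np.asarray(band14, dtype=float)
    return (band7 - band14) / (band7 + band14 + eps)

# One pixel's brightness-temperature time series (kelvin, made-up values):
# as a fire ignites, the MIR band rises much faster than the LWIR band,
# so the index grows over time — the signal the GRU learns to classify.
mir = np.array([300.0, 302.0, 325.0, 360.0])
lwir = np.array([295.0, 296.0, 298.0, 300.0])
index = abi_normalized_difference(mir, lwir)  # one value per time step
```

Because the index is a ratio of brightness temperatures, it stays bounded in (-1, 1), which keeps the per-pixel sequences on a comparable scale before they enter the GRU.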

For active fire detection utilizing the VIIRS time series, a Transformer-based solution is proposed. The VIIRS time-series images are tokenized into vectors of pixel time series as the input to the proposed Transformer model. The attention mechanism of the Transformer helps to find the relations of the pixel at different time steps. By detecting the variation of the pixel values, the proposed model classifies the pixel at different time steps as an active fire pixel or a non-fire pixel. The proposed method is tested over 18 study areas across different regions and provides a 0.804 F1-score, outperforming NASA's VIIRS active fire products, which have a 0.663 F1-score. The Transformer model also proves superior for active fire detection to other sequential models like GRU (0.647 F1-score) and LSTM (0.756 F1-score). Moreover, both F1 and IoU scores indicate that the sequential models perform much better than spatial ConvNet models, for example, UNet (0.609 F1-score) and Trans-U-Net (0.641 F1-score).
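The tokenization step — turning an image time series into per-pixel sequences that a Transformer (or GRU/LSTM) can consume — can be sketched as below. The array layout and function name are assumptions for illustration; the thesis's actual preprocessing pipeline is not reproduced here.

```python
import numpy as np

def tokenize_pixel_time_series(cube):
    """Reshape an image time series of shape (T, H, W, B)
    (T time steps, H x W pixels, B bands) into per-pixel token
    sequences of shape (H*W, T, B), so each row is one pixel's
    full time series of band vectors. Illustrative sketch only."""
    t, h, w, b = cube.shape
    # Bring the spatial axes to the front, then flatten them so the
    # sequence (time) axis is what the sequential model attends over.
    return cube.transpose(1, 2, 0, 3).reshape(h * w, t, b)

# 10 time steps, a 4x4 pixel tile, 5 bands -> 16 sequences of 10 tokens.
cube = np.arange(10 * 4 * 4 * 5, dtype=float).reshape(10, 4, 4, 5)
tokens = tokenize_pixel_time_series(cube)
```

Each row of `tokens` would then be fed through the model independently, which is what lets a sequence model classify every pixel at every time step.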

Future research is planned to explore the potential of both optical and SAR satellite data such as VIIRS, Sentinel-2, Landsat-8/9, Sentinel-1 C-band SAR and ALOS Phased Array L-band Synthetic Aperture Radar (PALSAR) for daily wildfire progression mapping. Advanced deep learning models, for example, Swin-Transformer and SwinUNETR, will also be investigated to improve multi-sensor exploitation.

Abstract [sv]

In recent years, climate change and human activity have caused increasing numbers of wildfires. Earth observation data with different spatial and temporal resolutions have shown great potential for detecting and monitoring wildfires. The Advanced Baseline Imager (ABI) on NOAA's geostationary weather satellites, the Geostationary Operational Environmental Satellites R Series (GOES-R), can acquire images every 15 minutes at 2 km spatial resolution and has been used for early fire detection. The Moderate Resolution Imaging Spectroradiometer (MODIS) and the Visible Infrared Imaging Radiometer Suite (VIIRS) on sun-synchronous satellites revisit twice daily and are widely used for active fire detection. The VIIRS Active Fire product (VNP14IMG) has a spatial resolution of 375 m and the MODIS Active Fire product (MCD14DL) has a spatial resolution of 1 km. Although these products are very useful, the existing solutions have shortcomings, including many false alarms due to cloud cover or buildings with high-temperature roofs. Nor does the multi-criteria threshold-based method exploit the rich temporal information of each pixel at different points in time and the rich spatial information between neighbouring pixels. Advanced processing algorithms are therefore needed to detect active fires reliably.

The main objective of this thesis is to develop deep learning-based methods for improved active fire detection using multi-sensor earth observation imagery. The high temporal resolution of the above satellites makes the temporal information more valuable than the spatial resolution. Sequential deep learning models such as Gated Recurrent Unit (GRU), Long-Short Term Memory (LSTM) and Transformer are therefore promising candidates for exploiting the temporal information encoded in the variation of the thermal band values. In this thesis, a GRU-based early fire detection method using GOES-R ABI time series is proposed, which detects wildfires earlier than NASA's VIIRS active fire product. In addition, a Transformer-based method utilizing Suomi National Polar-orbiting Partnership (Suomi-NPP) VIIRS time series is proposed, which shows better accuracy in active fire detection than the VIIRS active fire product.

The GRU-based GOES-R early detection method uses GOES-R ABI time series consisting of the normalized difference between Mid-Infrared (MIR) Band 7 and Long-wave Infrared Band 14. Long-wave Infrared Band 15 is used as the cloud mask. A five-layer GRU network is proposed to process the time series of each pixel and classify the active fire pixels at each time step. For 36 study areas in North and South America, the proposed method detects 26 wildfires earlier than the VIIRS active fire product. The method also mitigates the problem of the coarse resolution of the GOES-R ABI images through upsampling, and the results show more reliable early-stage active fire locations and suppress the noise compared to the GOES-R active fire product.

For active fire detection using VIIRS time series, a Transformer-based solution is proposed. The VIIRS time-series images are tokenized into vectors of pixel time series as input to the proposed Transformer model. The attention mechanism of the Transformer helps to find the relations between the pixels at different time steps. By detecting the variation of the pixel values, the proposed model classifies the pixel at different time steps as an active fire pixel or a non-fire pixel. The proposed method is tested over 18 study areas in different regions and achieves an F1-score of 0.804. It outperforms NASA's VIIRS active fire products, which have an F1-score of 0.663. The Transformer model also proves superior for active fire detection to other sequential models such as GRU (0.647 F1-score) and LSTM (0.756 F1-score). Furthermore, both F1 and IoU scores of all sequential models indicate that the sequential models perform much better than spatial ConvNet models, for example UNet (0.609 F1-score) and Trans-U-Net (0.641 F1-score).

Future research is planned to explore the potential of both optical and SAR satellite data, e.g. VIIRS, Sentinel-2, Landsat-8/9, Sentinel-1 C-band SAR and ALOS Phased Array L-band Synthetic Aperture Radar (PALSAR), for daily wildfire progression mapping. Advanced deep learning models, e.g. Swin-Transformer and SwinUNETR, will also be investigated to improve multi-sensor exploitation.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2023. p. xv, 67
Series
TRITA-ABE-DLT ; 2315
Keywords
Wildfire, Remote Sensing, Active Fire Detection, GOES-R ABI, Suomi-NPP VIIRS, Image Segmentation, Deep Learning, Gated Recurrent Units (GRU), Transformer
National Category
Earth Observation
Research subject
Geodesy and Geoinformatics, Geoinformatics
Identifiers
urn:nbn:se:kth:diva-327380 (URN)978-91-8040-529-4 (ISBN)
Presentation
2023-06-15, E53, Osquarsbacke 18, KTH Campus, video conference link [MISSING], Stockholm, 10:00 (English)
Opponent
Supervisors
Note

QC 20230526

Available from: 2023-05-26 Created: 2023-05-25 Last updated: 2025-12-16 Bibliographically approved
Zhao, Y., Ban, Y. & Sullivan, J. (2023). Tokenized Time-Series in Satellite Image Segmentation With Transformer Network for Active Fire Detection. IEEE Transactions on Geoscience and Remote Sensing, 61, Article ID 4405513.
2023 (English)In: IEEE Transactions on Geoscience and Remote Sensing, ISSN 0196-2892, E-ISSN 1558-0644, Vol. 61, article id 4405513Article in journal (Refereed) Published
Abstract [en]

The Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi National Polar-orbiting Partnership (Suomi-NPP) satellite has been used for the early detection and daily monitoring of active wildfires. Reliably segmenting active fire (AF) pixels from VIIRS image time series remains a challenge, because automatic methods tend to achieve high recall only at low precision. For AF detection, multicriteria thresholding is often applied to both low-resolution and mid-resolution Earth observation images. Deep learning approaches based on convolutional neural networks (ConvNets) are also well studied on mid-resolution images. However, ConvNet-based approaches perform poorly on low-resolution images because of the coarse spatial features. On the other hand, the high temporal resolution of VIIRS images highlights the potential of using sequential models for AF detection. Transformer networks, a recent deep learning architecture based on self-attention, offer hope as they have shown strong performance on image segmentation and sequential modeling tasks within computer vision. In this research, we propose a transformer-based solution to segment AF pixels from the VIIRS time series. The solution feeds a time series of tokenized pixels into a transformer network to identify AF pixels at each timestamp and achieves a significantly higher F1-score than prior approaches for AFs within the study areas in California, New Mexico, and Oregon in the U.S., in British Columbia and Alberta in Canada, as well as in Australia and Sweden.
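The precision/recall trade-off mentioned above combines into the F1-score as the harmonic mean of the two. A minimal sketch from raw detection counts (the counts below are made-up examples, not the paper's data):

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, computed from raw
    counts of true positives, false positives and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# High recall (0.9) with low precision (0.45) — i.e. many false
# alarms — still drags the F1-score down:
print(round(f1_score(tp=90, fp=110, fn=10), 3))  # → 0.6
```

The harmonic mean is what makes F1 a stricter summary than accuracy for the heavily imbalanced fire/non-fire pixel classes: a detector cannot compensate for poor precision with high recall alone.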

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2023
Keywords
Active fire (AF) detection, image segmentation, remote sensing, transformer, Visible Infrared Imaging Radiometer Suite (VIIRS)
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-334367 (URN)10.1109/TGRS.2023.3287498 (DOI)001030654100010 ()2-s2.0-85162916865 (Scopus ID)
Note

QC 20230821

Available from: 2023-08-18 Created: 2023-08-18 Last updated: 2025-02-07 Bibliographically approved
Gerard, S., Zhao, Y. & Sullivan, J. (2023). WildfireSpreadTS: A dataset of multi-modal time series for wildfire spread prediction. In: Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023: . Paper presented at 37th Conference on Neural Information Processing Systems, NeurIPS 2023, New Orleans, United States of America, Dec 10 2023 - Dec 16 2023. Neural Information Processing Systems Foundation
2023 (English)In: Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023, Neural Information Processing Systems Foundation , 2023Conference paper, Published paper (Refereed)
Abstract [en]

We present a multi-temporal, multi-modal remote-sensing dataset for predicting how active wildfires will spread at a resolution of 24 hours. The dataset consists of 13 607 images across 607 fire events in the United States from January 2018 to October 2021. For each fire event, the dataset contains a full time series of daily observations, containing detected active fires and variables related to fuel, topography and weather conditions. The dataset is challenging due to: a) its inputs being multi-temporal, b) the high number of 23 multi-modal input channels, c) highly imbalanced labels and d) noisy labels, due to smoke, clouds, and inaccuracies in the active fire detection. The underlying complexity of the physical processes adds to these challenges. Compared to existing public datasets in this area, WildfireSpreadTS allows for multi-temporal modeling of spreading wildfires, due to its time series structure. Furthermore, we provide additional input modalities and a high spatial resolution of 375 m for the active fire maps. We publish this dataset to encourage further research on this important task with multi-temporal, noise-resistant or generative methods, uncertainty estimation or advanced optimization techniques that deal with the high-dimensional input space.

Place, publisher, year, edition, pages
Neural Information Processing Systems Foundation, 2023
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-346140 (URN)001230083405038 ()2-s2.0-85191155663 (Scopus ID)
Conference
37th Conference on Neural Information Processing Systems, NeurIPS 2023, New Orleans, United States of America, Dec 10 2023 - Dec 16 2023
Note

QC 20240506

Available from: 2024-05-03 Created: 2024-05-03 Last updated: 2024-08-20 Bibliographically approved
Organisations
Identifiers
ORCID iD: orcid.org/0000-0002-4230-2467
