Building Instance Segmentation from Aerial Imagery and LiDAR Point Clouds: A Case Study on the Opportunities and Challenges of Using AI for the Update of Cadastral Databases in Upplands Väsby Municipality
2025 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Byggnadssegmentering från flygbilder och LiDAR-punktmoln : En fallstudie om möjligheter och utmaningar med att använda AI för uppdatering av byggnadsdatabaser i Upplands Väsby kommun (Swedish)
Abstract [en]
This thesis explores the potential of AI-based building detection to support building database maintenance and regulatory supervision in Upplands Väsby municipality. By combining stakeholder input from surveys and semi-structured interviews with technical evaluations of three segmentation models—SAM2, LangSAM, and Mask R-CNN—the study assesses both organizational readiness and model performance. Input data included aerial imagery and LiDAR point clouds, and various geometric regularization techniques were tested. Performance was measured using mean IoU, precision, recall, and F1-score.
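As an illustration of the metrics named above, the following is a minimal sketch of how IoU, precision, recall, and F1-score can be computed for a predicted building mask against a reference mask. It assumes pixel-level comparison of binary masks stored as nested lists; the abstract does not state whether the thesis evaluates at pixel or object level, so the function names (`confusion_counts`, `mask_metrics`, `mean_iou`) and the mask representation are hypothetical and not taken from the thesis.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: a mask is a 2D grid of 0/1 values (1 = building pixel).
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives per pixel."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def mask_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Return IoU, precision, recall, and F1 for one predicted/reference pair."""
    tp, fp, fn = confusion_counts(pred, truth)
    union = tp + fp + fn
    iou = tp / union if union else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"iou": iou, "precision": precision, "recall": recall, "f1": f1}


def mean_iou(pairs: List[Tuple[Mask, Mask]]) -> float:
    """Average the IoU over several (prediction, reference) pairs."""
    ious = [mask_metrics(p, t)["iou"] for p, t in pairs]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example: the prediction is shifted one column left of the reference.
    pred = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
    truth = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
    print(mask_metrics(pred, truth))
```

In this toy case two pixels overlap, two are predicted but not in the reference, and two are in the reference but missed, so precision and recall are both 0.5 while IoU is lower at 1/3. This shows why IoU is the stricter of the reported measures for footprint accuracy.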
The findings highlight both organizational and technical challenges. While stakeholders expressed interest in AI adoption, concerns about GDPR compliance, limited internal expertise, resource constraints, and inconsistent data quality emerged as significant barriers. From a technical perspective, none of the models consistently achieved the precision and reliability needed for fully automated use. Performance varied with urban context, input data, and regularization method. Mask R-CNN showed the best overall performance, particularly when scalability is taken into account, but still suffered from low recall. SAM2 offered high segmentation precision on individual buildings when given point prompts but lacked scalability. LangSAM was more scalable thanks to text prompts but struggled with consistency and recall at larger scales. For change detection, the best results were obtained by combining outputs from LangSAM and Mask R-CNN.
Overall, the evaluated models are more suitable for lower-precision internal tasks, such as general land use mapping. For high-precision applications, further model refinement, improved data quality, and seamless system integration are essential. With targeted development, AI segmentation tools like Mask R-CNN and SAM2 could play a valuable role in future municipal workflows.
Place, publisher, year, edition, pages
2025.
Series
TRITA-ABE-MBT ; 25654
Keywords [en]
building footprint extraction, instance segmentation, Segment Anything Model, Mask R-CNN, VHR imagery, LiDAR, remote sensing, artificial intelligence
Keywords [sv]
byggnadsavtryck, objektsegmentering, Segment Anything Model, Mask R-CNN, VHR-bilder, LiDAR, fjärranalys, artificiell intelligens
National Category
Earth Observation
Identifiers
URN: urn:nbn:se:kth:diva-369983
OAI: oai:DiVA.org:kth-369983
DiVA, id: diva2:1998592
External cooperation
Upplands Väsby kommun
Presentation
2025-06-13, 00:00 (English)
Supervisors
Examiners
2025-09-17 2025-09-17 2025-09-17 Bibliographically approved