DSFPAP-Net: Deeper and Stronger Feature Path Aggregation Pyramid Network for Object Detection in Remote Sensing Images
2024 (English)
In: IEEE Geoscience and Remote Sensing Letters, ISSN 1545-598X, E-ISSN 1558-0571, Vol. 21, article id 6010505
Article in journal (Refereed) Published
Abstract [en]
Rapid detection of small objects in remote sensing (RS) images is crucial for intelligence acquisition, for instance, enemy ship detection. Compared with high-resolution images of the same size, low-resolution images typically cover a wider area and thus facilitate efficient object detection. However, accurately detecting small objects in such images remains a challenge because they carry limited visual information and are difficult to distinguish from the background. To address this issue, we propose a small object detection method for low-resolution RS images called the deeper and stronger feature path aggregation pyramid network (DSFPAP-Net). First, we design aggregation networks with deeper paths that use feature layers closer to the shallow layers, which improves the acquisition of information about small objects. Second, to strengthen the network's focus on small objects, we propose a resolution-adjustable 3-D weighted attention (RA3-DWA) mechanism. This mechanism learns spatial feature information independently and assigns 3-D weights specifically to small objects. Finally, we propose the Fast-EIoU loss function to accelerate the model's bounding-box regression. This loss function assigns an acceleration factor to the length loss and the width loss, respectively, which further improves the detection accuracy for small objects. Experiments on Levir-Ship and DOTA demonstrate the effectiveness and efficiency of the proposed method. Compared to the baseline YOLOv5, our method improves the average detection accuracy on the Levir-Ship dataset by 6.7% (reaching 82.6%) and on the DOTA dataset by 6.4% (reaching 73.7%).
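For readers unfamiliar with the loss family that Fast-EIoU extends, the following is a minimal sketch of the standard EIoU loss for a single pair of axis-aligned boxes. The abstract does not define the acceleration factors, so `w_factor` and `h_factor` are hypothetical scalar multipliers added purely for illustration, and mapping the abstract's "length loss and width loss" onto the width and height terms below is also an assumption. With both factors left at 1.0 the function reduces to plain EIoU; it should not be read as the authors' Fast-EIoU formulation.

```python
def eiou_loss(pred, target, w_factor=1.0, h_factor=1.0, eps=1e-9):
    """Standard EIoU loss for two boxes given as (x1, y1, x2, y2).

    w_factor and h_factor are hypothetical multipliers on the width and
    height terms (an assumption, not taken from the paper). With both at
    1.0 this is the usual EIoU loss.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both inputs.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Squared width and height differences, each normalized by the
    # corresponding squared side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + w_factor * width_term + h_factor * height_term


if __name__ == "__main__":
    # Boxes that differ only in width: the width term dominates the penalty.
    print(eiou_loss((0, 0, 2, 2), (0, 0, 4, 2)))
    print(eiou_loss((0, 0, 2, 2), (0, 0, 4, 2), w_factor=2.0))
```

Raising `w_factor` above 1.0 in this sketch simply enlarges the penalty on width mismatch, which is one plausible way a per-dimension factor could speed up regression along that dimension; the actual form used in the paper may differ.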
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. Vol. 21, article id 6010505
Keywords [en]
Remote sensing, Feature extraction, Image resolution, YOLO, Semantics, Kernel, Convolution, 3-D attention, fast-EIoU loss function, low-resolution remote sensing images, small objects
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-350124
DOI: 10.1109/LGRS.2024.3398727
ISI: 001248303400019
Scopus ID: 2-s2.0-85192789805
OAI: oai:DiVA.org:kth-350124
DiVA, id: diva2:1882887
Note
QC 20240708
2024-07-08, 2024-07-08, 2025-02-07. Bibliographically approved