• Laser & Optoelectronics Progress
  • Vol. 60, Issue 12, 1228011 (2023)
Xingbo Han1,2 and Fan Li1,2,*
Author Affiliations
  • 1Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, Yunnan, China
  • 2Yunnan Key Laboratory of Artificial Intelligence, Kunming 650504, Yunnan, China
    DOI: 10.3788/LOP221744
    Xingbo Han, Fan Li. Remote Sensing Small Object Detection Based on Cross-Layer Attention Enhancement[J]. Laser & Optoelectronics Progress, 2023, 60(12): 1228011
    Fig. 1. Details of YOLOv5 network
    Fig. 2. Overall structure of the proposed model
    Fig. 3. ResCatPAN structure
    Fig. 4. ResCat structure
    Fig. 5. Overall structure of the cross-layer attention. (a) Complete flow of the cross-layer attention; (b) catt module of the cross-layer attention; (c) satt module of the cross-layer attention
    Fig. 6. Distribution statistics of label boxes in the dataset. (a) Distribution of label box sizes; (b) distribution of label box center points
    Fig. 7. Effect of hyperparameter γ on detection performance. (a) Small objects; (b) medium objects; (c) large objects; (d) all objects
    Fig. 8. Heat map generated by the proposed CLAT module by using the Grad-CAM method
    Fig. 9. Comparison of detection results of the baseline and the proposed model on the test set. (a) Baseline; (b) proposed model
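    | Method | Backbone | APs /% | mAP /% |
    |---|---|---|---|
    | SSD | VGG16 | - | 58.6 |
    | YOLOv3 | Darknet-53 | 11.6 | 57.1 |
    | Faster R-CNN with FPN | ResNet-101 | - | 65.1 |
    | Mask R-CNN with FPN | ResNet-101 | - | 65.2 |
    | Libra R-CNN | ResNet-101 | 14.9 | 79.7 |
    | Dynamic R-CNN | ResNet-50 | 12.1 | 77.3 |
    | YOLOX | Darknet-53 | 17.3 | 85.7 |
    | YOLOv5x6 | CSPDark-53 | 19.9 | 86.6 |
    | Proposed method | CSPDark-53 | 23.4 | 86.4 |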
    Table 1. Comparison experiment
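    | Baseline | RCP | CLAT | mAP /% | APs /% | APm /% | APl /% |
    |---|---|---|---|---|---|---|
    |  |  |  | 85.5 | 17.5 | 52.0 | 75.6 |
    |  |  |  | 85.4 | 21.6 | 51.7 | 75.4 |
    |  |  |  | 86.2 | 19.2 | 52.2 | 76.1 |
    |  |  |  | 86.4 | 23.4 | 52.3 | 75.9 |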
    Table 2. Ablation experiment
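The ablation figures in Table 2 are easier to compare as differences from the first row. The following is a minimal sketch, assuming that the first row is the unmodified baseline and the last row the full model (its mAP and APs match the proposed method in Table 1). Which of RCP and CLAT the two middle rows enable is not recoverable from the table as shown, so rows are referred to by position only. The names `rows`, `metrics`, and `deltas` are illustrative.

```python
# Rows of Table 2 in order of appearance: (mAP, APs, APm, APl), all in percent.
rows = [
    (85.5, 17.5, 52.0, 75.6),  # assumed: baseline only
    (85.4, 21.6, 51.7, 75.4),
    (86.2, 19.2, 52.2, 76.1),
    (86.4, 23.4, 52.3, 75.9),  # assumed: full model
]
metrics = ("mAP", "APs", "APm", "APl")


def deltas(row, reference):
    """Return percentage-point differences between a row and a reference row."""
    return {m: round(v - r, 1) for m, v, r in zip(metrics, row, reference)}


baseline = rows[0]
for position, row in enumerate(rows[1:], start=2):
    print(f"row {position} vs row 1:", deltas(row, baseline))
```

Under these assumptions, the last row differs from the first by +5.9 points in APs and +0.9 points in mAP, while APm and APl change by 0.3 points each.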
    | Class | Number of samples | Class | Number of samples | Class | Number of samples | Class | Number of samples |
    |---|---|---|---|---|---|---|---|
    | golf-field | 881 | dam | 824 | stadium | 1003 | airport | 1071 |
    | train-station | 811 | chimney | 1340 | harbor | 4447 | bridge | 3161 |
    | expressway-service-area | 1743 | basketball-court | 2658 | overpass | 2478 | airplane | 8100 |
    | expressway-toll-station | 1028 | ground-track-field | 2390 | windmill | 4371 | vehicle | 32180 |
    | baseball-field | 4674 | tennis-court | 9621 | storage-tank | 20717 | ship | 50207 |
    Table 3. Sample distribution of the DIOR dataset
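The class imbalance implied by Table 3 can be quantified with a few lines of Python. This is a minimal sketch that only restates the counts listed in the table. The variable names are illustrative, and the total is the sum of the listed counts, not a figure reported by the authors.

```python
# Sample counts per class, copied from Table 3.
counts = {
    "golf-field": 881, "dam": 824, "stadium": 1003, "airport": 1071,
    "train-station": 811, "chimney": 1340, "harbor": 4447, "bridge": 3161,
    "expressway-service-area": 1743, "basketball-court": 2658,
    "overpass": 2478, "airplane": 8100,
    "expressway-toll-station": 1028, "ground-track-field": 2390,
    "windmill": 4371, "vehicle": 32180,
    "baseball-field": 4674, "tennis-court": 9621,
    "storage-tank": 20717, "ship": 50207,
}

total = sum(counts.values())
largest = max(counts, key=counts.get)    # class with the most samples
smallest = min(counts, key=counts.get)   # class with the fewest samples
imbalance = counts[largest] / counts[smallest]

print(f"{len(counts)} classes, {total} samples in total")
print(f"largest: {largest} ({counts[largest] / total:.1%} of all samples)")
print(f"smallest: {smallest} ({counts[smallest] / total:.1%} of all samples)")
print(f"largest-to-smallest ratio: {imbalance:.1f}")
```

With the listed counts, `ship` is the largest class and `train-station` the smallest, a ratio of roughly 62 to 1.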
    | Model | Backbone | Time /s |
    |---|---|---|
    | Baseline | CSPDark-53 | 0.06 |
    | Proposed model | CSPDark-53 | 0.09 |
    Table 4. Time consumption comparison
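The cost of the added modules can be expressed as a relative overhead. The sketch below assumes that the times in Table 4 are per-image inference times, which the table itself does not state. The frames-per-second figures hold only under that assumption, and the variable names are illustrative.

```python
# Times from Table 4, in seconds (assumed here to be per-image inference time).
baseline_s = 0.06
proposed_s = 0.09

extra_s = proposed_s - baseline_s            # absolute overhead in seconds
relative_increase = extra_s / baseline_s     # overhead as a fraction of the baseline

# Throughput under the per-image assumption.
baseline_fps = 1.0 / baseline_s
proposed_fps = 1.0 / proposed_s

print(f"overhead: {extra_s:.2f} s ({relative_increase:.0%} over the baseline)")
print(f"throughput: {baseline_fps:.1f} -> {proposed_fps:.1f} images per second")
```

Read this way, the proposed model takes about 50% longer than the baseline.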