• Laser & Optoelectronics Progress
  • Vol. 59, Issue 22, 2228005 (2022)
Xulun Liu1, Shiping Ma1, Linyuan He1,2,*, Chen Wang1, Xu He1 and Zhe Chen3
Author Affiliations
  • 1School of Aeronautical Engineering, Air Force Engineering University, Xi'an 710038, Shaanxi, China
  • 2Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China
  • 3School of Cyberspace Security, Xi'an University of Posts & Telecommunications, Xi'an 710121, Shaanxi, China
    DOI: 10.3788/LOP202259.2228005
    Xulun Liu, Shiping Ma, Linyuan He, Chen Wang, Xu He, Zhe Chen. Target Detection Method for Remote Sensing Images Based on Sparse Mask Transformer[J]. Laser & Optoelectronics Progress, 2022, 59(22): 2228005
    Fig. 1. Structure diagram of the proposed network
    Fig. 2. Schematic of attention block. MSA is multi-head attention; SI-MSA is sparse-interpolation multi-head attention. (a) Standard attention block; (b) sparse-interpolation attention block
    Fig. 3. Sparse-interpolation multi-head self-attention
    Fig. 4. Deterministic sampling and stochastic sampling. (a) Deterministic sampling; (b) stochastic sampling
    Fig. 5. Visualization of parts of the detection results
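    Per-class AP /% and overall mAP /%
    Method          | PL    | BD    | BR    | GTF   | SV    | LV    | SH    | TC    | BC    | ST    | SBF   | RA    | HA    | SP    | HC    | mAP
    R2CNN [19]      | 80.94 | 65.67 | 35.34 | 67.44 | 59.92 | 50.91 | 55.81 | 90.67 | 66.92 | 72.39 | 55.06 | 52.23 | 55.14 | 53.35 | 48.22 | 60.67
    RRPN [20]       | 88.52 | 71.20 | 31.66 | 59.30 | 51.85 | 56.19 | 57.25 | 90.81 | 72.84 | 67.38 | 56.69 | 52.84 | 53.08 | 51.94 | 53.58 | 61.01
    RT [21]         | 88.64 | 78.52 | 43.44 | 75.92 | 68.81 | 73.68 | 83.59 | 90.74 | 77.27 | 81.46 | 58.39 | 53.54 | 62.83 | 58.93 | 47.64 | 69.56
    CAD-Net [2]     | 87.80 | 82.40 | 49.40 | 73.50 | 71.10 | 64.50 | 76.60 | 90.90 | 79.20 | 73.30 | 48.40 | 60.90 | 62.00 | 67.00 | 62.20 | 69.90
    SCRDet [22]     | 89.98 | 80.65 | 52.09 | 68.36 | 68.36 | 60.32 | 72.41 | 90.85 | 87.94 | 86.86 | 65.02 | 66.68 | 66.25 | 68.24 | 65.21 | 72.61
    GV [3]          | 89.64 | 85.00 | 52.26 | 77.34 | 73.01 | 73.14 | 86.82 | 90.74 | 79.02 | 86.81 | 59.55 | 70.91 | 72.94 | 70.86 | 57.32 | 75.02
    BBAVectors [23] | 88.63 | 84.06 | 52.13 | 69.56 | 78.26 | 80.40 | 88.06 | 90.87 | 87.23 | 86.39 | 56.11 | 65.52 | 67.10 | 72.08 | 63.96 | 75.36
    Proposed method | 89.14 | 84.40 | 54.73 | 76.80 | 79.21 | 82.01 | 89.23 | 91.34 | 86.05 | 88.54 | 68.65 | 69.90 | 70.83 | 74.27 | 71.37 | 78.43
    Table 1. Comparison of detection accuracy of different methods on the DOTA dataset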
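The mAP column of Table 1 appears to be the unweighted mean of the 15 per-class AP values. The page does not state this explicitly, so treat it as an assumption; the short Python check below, using two rows copied from the table, is consistent with it. The names `CLASSES`, `AP`, `REPORTED_MAP`, and `mean_ap` are illustrative and do not come from the paper.

```python
# Class abbreviations in the order of the Table 1 header.
CLASSES = ["PL", "BD", "BR", "GTF", "SV", "LV", "SH", "TC",
           "BC", "ST", "SBF", "RA", "HA", "SP", "HC"]

# Per-class AP values (%) copied from two rows of Table 1.
AP = {
    "BBAVectors": [88.63, 84.06, 52.13, 69.56, 78.26, 80.40, 88.06, 90.87,
                   87.23, 86.39, 56.11, 65.52, 67.10, 72.08, 63.96],
    "Proposed method": [89.14, 84.40, 54.73, 76.80, 79.21, 82.01, 89.23, 91.34,
                        86.05, 88.54, 68.65, 69.90, 70.83, 74.27, 71.37],
}

# mAP values (%) as reported in the last column of Table 1.
REPORTED_MAP = {"BBAVectors": 75.36, "Proposed method": 78.43}


def mean_ap(ap_values):
    """Unweighted mean of per-class APs (assumed definition of mAP here)."""
    return sum(ap_values) / len(ap_values)


for name, values in AP.items():
    assert len(values) == len(CLASSES)  # one AP per class
    print(f"{name}: computed {mean_ap(values):.2f}, reported {REPORTED_MAP[name]:.2f}")
```

For both rows the computed mean rounds to the reported mAP, which also serves as a check that the fused digits in the extracted table were split at the right places.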
    Model           | Backbone   | mAP /% | Speed /(frame·s⁻¹)
    R2CNN           | VGG-16     | 60.67  | 5.9
    RRPN            | VGG-16     | 61.01  | 7.2
    RT              | R101-FPN   | 69.56  | 7.8
    CAD-Net         | R101-FPN   | 69.90  | 7.9
    SCRDet          | R101-FPN   | 72.61  | 8.4
    GV              | R101-FPN   | 75.02  | 11.6
    BBAVectors      | ResNet-101 | 75.36  | 13.7
    Proposed method | ResNet-101 | 78.43  | 12.5
    Table 2. mAP value and detection speed of different detection methods on the DOTA dataset
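    Baseline | Multi-scale input | Sampling module | Interpolation module | Epoch | GFLOPs | mAP /%
             |                   |                 |                      | 50    | 152    | 65.33
             |                   |                 |                      | 500   | 152    | 76.41
             |                   |                 |                      | 50    | 1890   | 67.18
             |                   |                 |                      | 500   | 1890   | 78.23
             |                   |                 |                      | 50    | 138    | 77.86
             |                   |                 |                      | 50    | 140    | 78.43
    Table 3. Ablation study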
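As a rough illustration of the trade-off in Table 3, the sketch below compares two rows by position only: the 500-epoch, 1890-GFLOPs row and the final 50-epoch, 140-GFLOPs row. Which modules are enabled in each row cannot be read from this page, so nothing is assumed about the configurations. The variable names are illustrative.

```python
# Rows of Table 3 as (epochs, GFLOPs, mAP %), in table order.
ROWS = [
    (50, 152, 65.33),
    (500, 152, 76.41),
    (50, 1890, 67.18),
    (500, 1890, 78.23),
    (50, 138, 77.86),
    (50, 140, 78.43),
]

heavy = ROWS[3]   # 500 epochs, 1890 GFLOPs
light = ROWS[5]   # 50 epochs, 140 GFLOPs (mAP matches the proposed method in Table 1)

epoch_ratio = heavy[0] / light[0]    # how many times fewer training epochs
gflops_ratio = heavy[1] / light[1]   # how many times fewer GFLOPs
map_gain = light[2] - heavy[2]       # mAP difference in percentage points

print(f"epochs: {epoch_ratio:.0f}x fewer, GFLOPs: {gflops_ratio:.1f}x fewer, "
      f"mAP change: {map_gain:+.2f} points")
```

On these numbers, the final row reaches a slightly higher mAP with one tenth of the epochs and about 13.5 times fewer GFLOPs than the 1890-GFLOPs row.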