Lightweight Small Object Detection Algorithm Based on STD-DETR

Zeyu Yin; Bo Yang; Jinling Chen; Chuangchuang Zhu; Hongli Chen; Jin Tao

doi:10.3788/LOP241849

Abstract

Small-target detection in aerial images captured by unmanned aerial vehicles (UAVs) is challenging: backgrounds are complex, targets are tiny and densely packed, and models are difficult to deploy on mobile devices. To address these problems, this paper proposes STD-DETR, an improved lightweight small-target detection algorithm based on the Real-Time DEtection TRansformer (RT-DETR) model. First, RepConv is introduced to improve the lightweight StarNet network, which then replaces the original backbone to reduce the size of the model. Second, a novel feature pyramid is designed that incorporates the 160 pixel × 160 pixel feature map output by the P2 layer to enrich small-target information. This design is used in place of the traditional approach of adding a P2 small-target detection head, and it introduces the CSP-OmniKernel-squeeze-excitation (COSE) module and space-to-depth (SPD) convolution to enhance the extraction of global features and the fusion of multi-scale features. Finally, pixel intersection over union (PIoU) replaces the original model's loss function; it computes IoU at the pixel level to capture small overlapping regions more precisely, which reduces the miss rate and improves detection accuracy. Experimental results on the VisDrone2019 dataset show that, compared with the baseline model, STD-DETR improves accuracy, recall, and mAP₅₀ by 1.3, 2.2, and 2.3 percentage points, respectively, while reducing computational cost and parameter count by approximately 34.0% and 37.9%, respectively. Generalization tests on the TinyPerson dataset show increases of 3.7 percentage points in accuracy and 3.1 percentage points in mAP₅₀, confirming the model's effectiveness and generalization capability.
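
The abstract does not spell out how the SPD convolution is built, so the following is only a minimal sketch of the generic space-to-depth rearrangement that such a layer typically starts from, written in NumPy. The function name `space_to_depth`, the `scale` parameter, and the toy input are illustrative assumptions, not the authors' implementation; in a full SPD convolution this rearrangement would be followed by a non-strided convolution, which is omitted here.

```python
import numpy as np


def space_to_depth(x, scale=2):
    """Rearrange a (C, H, W) feature map into (C * scale**2, H / scale, W / scale).

    Hypothetical helper for illustration: every pixel is kept, but spatial
    detail is moved into the channel dimension instead of being discarded
    by a strided convolution or pooling layer.
    """
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # Take each of the scale*scale interleaved sub-grids and stack them on channels.
    parts = [x[:, i::scale, j::scale] for i in range(scale) for j in range(scale)]
    return np.concatenate(parts, axis=0)


# Toy input: 2 channels, 4 x 4 spatial size.
feat = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
out = space_to_depth(feat)
print(out.shape)
```

Under these assumptions, a feature map with 160 × 160 spatial resolution would be halved to 80 × 80 while its channel count is multiplied by four, so the fine detail that matters for small targets is preserved rather than thrown away during downsampling.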
