• Laser & Optoelectronics Progress
  • Vol. 61, Issue 18, 1837010 (2024)
Liangzi Wang1,2,3, Miaohua Huang1,2,3,*, Ruoying Liu1,2,3, Chengcheng Bi1,2,3, and Yongkang Hu1,2,3
Author Affiliations
  • 1Hubei Key Laboratory of Advanced Technology for Automotive Components, Wuhan University of Technology, Wuhan 430070, Hubei, China
  • 2Hubei Collaborative Innovation Center for Automotive Components Technology, Wuhan University of Technology, Wuhan 430070, Hubei, China
  • 3Hubei Research Center for New Energy & Intelligent Connected Vehicle, Wuhan University of Technology, Wuhan 430070, Hubei, China
    DOI: 10.3788/LOP240516
    Liangzi Wang, Miaohua Huang, Ruoying Liu, Chengcheng Bi, Yongkang Hu. Improved Two-Stage 3D Object Detection Algorithm for Roadside Scenes with Enhanced PointPillars and Transformer[J]. Laser & Optoelectronics Progress, 2024, 61(18): 1837010

    Abstract

    This study proposes a two-stage three-dimensional object detection algorithm for roadside scenes that addresses two problems in point cloud object detection: high missed-detection rates for distant vehicles and high false-detection rates for pedestrians in complex scenes. The algorithm builds on improved versions of PointPillars and Transformer. In the first stage, the PointPillars-based backbone network incorporates the SimAM attention mechanism to capture similarity information and prioritize essential features, and the standard convolutional blocks in the downsampling section are replaced with residual structures to improve network performance. In the second stage, a Transformer refines the candidate boxes generated in the first stage: the encoder encodes the original point features, while the decoder applies channel weighting to enhance channel information, which improves detection accuracy and reduces false detections. The algorithm was evaluated on the DAIR-V2X-I roadside dataset and the KITTI vehicle-side dataset. The experimental results show substantially higher detection accuracy than other publicly available algorithms. Compared with the baseline PointPillars algorithm at moderate detection difficulty, detection accuracy for cars, pedestrians, and cyclists improved by 1.9, 10.5, and 2.11 percentage points, respectively, on the DAIR-V2X-I dataset, and by 2.34, 4.73, and 8.17 percentage points, respectively, on the KITTI dataset.
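
    The abstract names SimAM as the attention mechanism added to the first-stage backbone but does not spell out how it weights features. As a rough illustration only, the sketch below applies the parameter-free weighting commonly described for SimAM to a single 2D feature channel in plain Python. It is not the authors' implementation: the function name `simam_channel`, the list-of-rows input format, and the default `e_lambda=1e-4` are assumptions made for this example.

```python
import math


def simam_channel(channel, e_lambda=1e-4):
    """Apply SimAM-style weighting to one 2D feature channel.

    `channel` is a list of rows of floats (hypothetical input format).
    Each value is scaled by sigmoid(inverse energy), so values that
    deviate more from the channel mean receive a larger weight.
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1  # number of "other" positions in the channel
    if n < 1:
        raise ValueError("channel must contain at least two values")

    mean = sum(values) / len(values)
    # Squared deviation of every position from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n

    weighted = []
    for row, dev_row in zip(channel, sq_dev):
        out_row = []
        for v, d in zip(row, dev_row):
            # Inverse energy: grows with the position's distinctiveness.
            inv_energy = d / (4.0 * (variance + e_lambda)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            out_row.append(v * weight)
        weighted.append(out_row)
    return weighted


if __name__ == "__main__":
    # One strong response surrounded by a uniform background.
    demo = [[1.0, 1.0, 1.0],
            [1.0, 5.0, 1.0],
            [1.0, 1.0, 1.0]]
    for row in simam_channel(demo):
        print([round(v, 3) for v in row])
```

    In this toy input the central value differs most from the channel mean, so it keeps a larger fraction of its magnitude than the background values. The mechanism adds no learnable parameters to the network.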