• Optics and Precision Engineering
  • Vol. 29, Issue 9, 2247 (2021)
Jun-ying CHEN*, Tong-yao BAI, and Liang ZHAO
Author Affiliations
  • School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China
    DOI: 10.37188/OPE.20212909.2247
    Jun-ying CHEN, Tong-yao BAI, Liang ZHAO. 3D object detection based on fusion of point cloud and image by mutual attention[J]. Optics and Precision Engineering, 2021, 29(9): 2247

    Abstract

    To use image information to help point clouds improve the accuracy of 3D object detection, the image feature space and the point cloud feature space must be adaptively aligned and fused. A deep learning network based on adaptive fusion of multimodal features was proposed for 3D object detection. First, a voxelization method was used to partition the point cloud into uniform voxels. Each voxel feature was derived from the features of the points it contains, and a 3D sparse convolutional neural network was used to learn the point cloud features. In parallel, a ResNet-like neural network was used to extract the image features. Next, a mutual attention module was introduced to adaptively align the image features with the point cloud features, yielding point cloud features enhanced by the image features. Finally, based on these features, a region proposal network (RPN) and multitask learning networks for classification and regression were applied to achieve 3D object detection. Experimental results on the KITTI 3D object detection dataset showed average precisions of 88.76%, 77.63%, and 76.14% for car detection at the easy, moderate, and hard difficulty levels, respectively. The proposed method effectively fuses image and point cloud information and improves the precision of 3D object detection.
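
The first and third steps of the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify how voxel features are derived or how the mutual attention module is built, so the sketch assumes mean pooling of the points in each voxel and swaps in generic scaled dot-product cross-attention, with voxel features as queries and image features as keys and values. The function names (`voxelize_mean`, `cross_attention_fuse`) and the projection matrices `w_q`, `w_k`, `w_v` are hypothetical, and the 3D sparse convolution, ResNet-like backbone, and RPN stages are omitted.

```python
import numpy as np


def voxelize_mean(points, voxel_size):
    """Partition points of shape (N, 3 + C) into a uniform grid.

    Returns integer voxel coordinates (V, 3) and, as an assumed voxel
    feature, the mean of all point rows that fall into each voxel (V, 3 + C).
    """
    coords = np.floor(points[:, :3] / voxel_size).astype(np.int64)
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # guard against shape differences across NumPy versions
    sums = np.zeros((len(uniq), points.shape[1]))
    np.add.at(sums, inverse, points)  # accumulate point rows per voxel
    counts = np.bincount(inverse, minlength=len(uniq))[:, None]
    return uniq, sums / counts


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_fuse(voxel_feats, image_feats, w_q, w_k, w_v):
    """Enhance voxel features (V, Cv) with image features (P, Ci).

    Assumption: generic scaled dot-product cross-attention with a residual
    connection, standing in for the paper's mutual attention module.
    w_q: (Cv, d), w_k: (Ci, d), w_v: (Ci, Cv).
    """
    q = voxel_feats @ w_q                      # (V, d) queries from the point cloud
    k = image_feats @ w_k                      # (P, d) keys from the image
    v = image_feats @ w_v                      # (P, Cv) values from the image
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)  # (V, P) soft alignment
    return voxel_feats + attn @ v, attn        # residual fusion, attention weights
```

Each row of the returned attention matrix sums to one, so every voxel receives a weighted average of image features. This is one way to read "adaptive alignment": the weights are learned through the projections rather than fixed by a geometric projection of points onto the image plane.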