• Optics and Precision Engineering
  • Vol. 32, Issue 6, 901 (2024)
Minjia CHEN1,2, Shaoyan GAI1,2,*, Feipeng DA1,2, and Jian YU1,2,3,*
Author Affiliations
  • 1School of Automation, Southeast University, Nanjing 210096, China
  • 2Key Laboratory of Measurement and Control of Complex Engineering Systems, Ministry of Education, Southeast University, Nanjing 210096, China
  • 3Key Laboratory of Space Photoelectric Detection and Perception, Ministry of Industry and Information Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
    DOI: 10.37188/OPE.20243206.0901
    Minjia CHEN, Shaoyan GAI, Feipeng DA, Jian YU. Object 6-DoF pose estimation using auxiliary learning[J]. Optics and Precision Engineering, 2024, 32(6): 901

    Abstract

    To accurately estimate an object's position and orientation in the camera coordinate system under challenging conditions such as severe occlusion and scarce texture, while also improving network efficiency and simplifying the network architecture, this paper proposed a 6-DoF pose estimation method using auxiliary learning based on RGB-D data. The network took the target object's image patch, the corresponding depth map, and the CAD model as inputs. First, a dual-branch point cloud registration network produced predicted point clouds in both the model space and the camera space. Then, in the auxiliary learning branch, the target object's image patch and the Depth-XYZ representation obtained from the depth map were fed into a multi-modal feature extraction and fusion module, followed by coarse-to-fine pose estimation; the estimated results served as priors for optimizing the loss calculation. Finally, during performance evaluation, the auxiliary learning branch was discarded, and only the outputs of the dual-branch point cloud registration network were used for 6-DoF pose estimation via point pair feature matching. Experimental results show that the proposed method achieves an AUC of 95.9% and ADD-S<2 cm of 99.0% on the YCB-Video dataset, an ADD(-S) of 99.4% on the LineMOD dataset, and an ADD(-S) of 71.3% on the LM-O dataset. Compared with existing 6-DoF pose estimation methods, the proposed auxiliary-learning method has advantages in model performance and significantly improves pose estimation accuracy.
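    The abstract describes a dual-branch network that outputs corresponding point clouds in the model space and the camera space, from which the 6-DoF pose is recovered (the paper uses point pair feature matching for this step). As a minimal illustrative sketch only, assuming point-to-point correspondence between the two predicted clouds, the rigid transform relating them can be recovered in closed form with the Kabsch/Umeyama algorithm; this is a standard stand-in, not the paper's matching procedure, and the function name below is hypothetical:

    ```python
    import numpy as np

    def kabsch_pose(model_pts: np.ndarray, camera_pts: np.ndarray):
        """Recover (R, t) such that camera_pts ≈ model_pts @ R.T + t.

        model_pts, camera_pts: (N, 3) arrays of corresponding points in the
        model space and the camera space, respectively.
        """
        # Center both point sets on their centroids.
        mu_m = model_pts.mean(axis=0)
        mu_c = camera_pts.mean(axis=0)
        # Cross-covariance between the centered sets.
        H = (model_pts - mu_m).T @ (camera_pts - mu_c)
        U, _, Vt = np.linalg.svd(H)
        # Correct for a possible reflection so that det(R) = +1.
        d = np.sign(np.linalg.det(Vt.T @ U.T))
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = mu_c - R @ mu_m
        return R, t
    ```

    In the described pipeline this closed-form step would run only at inference time, after the auxiliary learning branch has been discarded.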