• Acta Optica Sinica
  • Vol. 45, Issue 5, 0515001 (2025)
Ying Zhang, Hongzhi Du, Yunbo Hu, Yanbiao Sun*, and Jigui Zhu
Author Affiliations
  • National Key Laboratory of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China
    DOI: 10.3788/AOS241716
    Ying Zhang, Hongzhi Du, Yunbo Hu, Yanbiao Sun, Jigui Zhu. Multi-Instance Point Cloud Pose Estimation Method Based on Gaussian-Weighted Voting Strategy[J]. Acta Optica Sinica, 2025, 45(5): 0515001

    Abstract

    Objective

    Object pose estimation is widely applied in fields such as robot grasping, augmented reality, and autonomous navigation. However, complex and unstructured environments present significant challenges, such as scattered stacking and mutual occlusion of objects, which complicate accurate 6D pose estimation of target objects. Traditional point cloud-based pose estimation methods typically determine the optimal transformation matrix between two point clouds. In real-world scenarios, stacked objects often have simple structures and lack distinctive features, making it difficult to extract corresponding points accurately. As a result, traditional methods are generally limited to single-target pose estimation and struggle with multi-instance, multi-target pose estimation, where the model point cloud must be aligned with multiple instances in the target scene. The problem is further complicated by the unknown number of instances and the occlusion between them.

    Methods

    In this paper, we introduce a multi-instance point cloud pose estimation method based on a Gaussian-weighted voting strategy. Occlusion and the lack of distinctive features in stacked workpieces often lead to biased voting and inaccurate corresponding points, so we propose a Gaussian-weighted voting approach for generating corresponding points. First, the point pair features of the model point cloud are extracted. The distribution of angles between the normal vectors of point pairs is then fitted with a Gaussian to calculate weight coefficients, so that votes are weighted according to the angular relationship between the normal vectors. This yields a refined set of corresponding points and an initial pose estimation set. To achieve multi-instance pose estimation, we introduce a clustering and optimization method based on distance invariance. A distance invariance matrix is constructed from the corresponding point set, and feature vector similarities are calculated to efficiently cluster multiple instances. Redundant poses are filtered out through refined clustering centered around the instances, while incorrect poses are eliminated by evaluating point cloud overlap. Final pose optimization is performed using the iterative closest point (ICP) algorithm.
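
    The voting step can be illustrated with a minimal sketch. The point pair feature below is the standard four-component form (pair distance plus three angles); everything about the weighting is an assumption, since the abstract does not give the formula: here a Gaussian is fitted to the normal-to-normal angles of the model pairs, and each vote is weighted by the unnormalised Gaussian value of its own angle. The function names and the toy model are hypothetical.

```python
import math
from itertools import permutations


def angle(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding before acos.
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))


def point_pair_feature(p1, n1, p2, n2):
    """Standard PPF (|d|, angle(n1, d), angle(n2, d), angle(n1, n2)); needs p1 != p2."""
    d = tuple(b - a for a, b in zip(p1, p2))
    length = math.sqrt(sum(c * c for c in d))
    return (length, angle(n1, d), angle(n2, d), angle(n1, n2))


def fit_gaussian(samples):
    """Mean and (population) standard deviation of the samples."""
    mu = sum(samples) / len(samples)
    sigma = math.sqrt(sum((s - mu) ** 2 for s in samples) / len(samples))
    return mu, sigma


def gaussian_weight(normal_angle, mu, sigma):
    """ASSUMPTION: vote weight is the unnormalised Gaussian of the normal angle.

    The paper's actual weight coefficient may be defined differently.
    """
    if sigma == 0.0:
        return 1.0 if normal_angle == mu else 0.0
    return math.exp(-((normal_angle - mu) ** 2) / (2.0 * sigma ** 2))


# Hypothetical toy model: (point, unit normal) pairs.
model = [
    ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    ((1.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    ((0.0, 1.0, 0.0), (1.0, 0.0, 0.0)),
]

# Features for every ordered pair of distinct model points.
features = [
    point_pair_feature(a[0], a[1], b[0], b[1]) for a, b in permutations(model, 2)
]

# Fit the Gaussian to the normal-to-normal angle (fourth PPF component).
mu, sigma = fit_gaussian([f[3] for f in features])

# One weight per pair; a voting scheme would add this instead of a unit vote.
vote_weights = [gaussian_weight(f[3], mu, sigma) for f in features]
```

    In a full pipeline these weights would replace the unit increments in the PPF accumulator, so that pairs with uninformative normal configurations contribute less to the pose hypotheses.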

    Results and Discussions

    To evaluate the effectiveness of the proposed method, both simulation and real-world robotic sorting experiments are conducted. In the simulation tests, mean recall (MR), mean precision (MP), and mean F1 score (MF) serve as evaluation metrics. On the Romain dataset, the Gaussian-weighted voting strategy improves the accuracy of the corresponding points and the initial pose set: compared with the point pair feature (PPF) method, the proposed approach reduces the average rotation error (RE) by 1.58° and the average translation error (TE) by 0.55 mm (Table 1). For multi-instance pose estimation on the Romain and ROBI datasets, the MF of the proposed method is 16.56 percentage points (Fig. 9, Table 2) and 15.39 percentage points (Fig. 11, Table 3) higher, respectively, than that of the best comparison method. Real-world tests show RMSE values within 3 mm, with a minimum of 1.13 mm. The average MR, MP, and MF values are 61.07%, 68.97%, and 64.57%, respectively (Fig. 13, Table 4). Robotic sorting experiments achieve a 93% success rate, significantly outperforming the 60% success rate of the PPF algorithm (Fig. 12, Table 5).
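
    For readers unfamiliar with these metrics, the sketch below shows one common way to compute them. It assumes RE is the geodesic angle between the ground-truth and estimated rotation matrices, TE is the Euclidean distance between the translations, and MR, MP, and MF are per-scene recall, precision, and F1 averaged over scenes. The abstract does not state the paper's exact definitions or the thresholds that decide when a pose counts as correct, so the function names and conventions here are illustrative only.

```python
import math


def rotation_error_deg(R_gt, R_est):
    """Geodesic angle in degrees between two 3x3 rotation matrices (nested lists)."""
    # trace(R_gt^T R_est) equals the sum of element-wise products.
    trace = sum(R_gt[i][j] * R_est[i][j] for i in range(3) for j in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def translation_error(t_gt, t_est):
    """Euclidean distance between two translation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_gt, t_est)))


def precision_recall_f1(true_positives, n_predicted, n_ground_truth):
    """Precision, recall, and F1 for one scene; empty denominators give 0."""
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_ground_truth if n_ground_truth else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1


def mean_metrics(scenes):
    """Average per-scene metrics; returns (MR, MP, MF).

    Each scene is (true_positives, n_predicted, n_ground_truth).
    """
    results = [precision_recall_f1(*scene) for scene in scenes]
    n = len(results)
    mean_precision = sum(r[0] for r in results) / n
    mean_recall = sum(r[1] for r in results) / n
    mean_f1 = sum(r[2] for r in results) / n
    return mean_recall, mean_precision, mean_f1
```

    Averaging F1 per scene generally gives a different number from the F1 of the averaged precision and recall; which convention the paper uses is not stated in the abstract.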

    Conclusions

    In this paper, we address the challenges of pose estimation errors caused by object occlusion and simple features in stacked scenes by proposing a multi-instance point cloud pose estimation method based on a Gaussian-weighted voting strategy. To address biased voting caused by simple workpiece features, the angular distribution between normal vectors of point pairs is Gaussian-fitted to refine the voting process, improving the accuracy of corresponding points and the initial pose estimation set. To enable multi-instance point cloud pose estimation and distinguish correspondences between different instances, a distance invariance matrix is constructed to efficiently cluster multiple instances. Redundant poses are filtered through refined clustering centered on instance points, while incorrect poses are eliminated through screening based on the point cloud overlap rate. Final pose optimization is achieved using the ICP algorithm. Simulation and robotic arm sorting experiments demonstrate the robustness and effectiveness of the proposed method in managing multi-instance recognition and pose estimation in stacked scenes. The approach shows significant potential for application in robotic arm automatic sorting systems within unstructured stacking environments.
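
    The distance-invariance idea rests on the fact that a rigid transformation preserves distances: if two model-to-scene correspondences belong to the same instance, the distance between their model points should equal the distance between their scene points. The sketch below builds a binary compatibility matrix from that test and groups correspondences whose matrix rows are similar. The tolerance, the cosine similarity measure, and the greedy grouping are all assumptions made for illustration; the abstract does not specify how the paper computes feature vector similarities or performs the clustering.

```python
import math


def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_invariance_matrix(correspondences, tol):
    """M[i][j] = 1 if correspondences i and j preserve their mutual distance.

    Each correspondence is a (model_point, scene_point) pair.
    """
    n = len(correspondences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d_model = dist(correspondences[i][0], correspondences[j][0])
            d_scene = dist(correspondences[i][1], correspondences[j][1])
            matrix[i][j] = 1.0 if abs(d_model - d_scene) <= tol else 0.0
    return matrix


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0.0 and norm_v > 0.0 else 0.0


def cluster_by_row_similarity(matrix, min_similarity):
    """ASSUMPTION: greedy grouping; a row joins the first cluster whose seed row is similar."""
    clusters = []
    for i, row in enumerate(matrix):
        for cluster in clusters:
            if cosine_similarity(row, matrix[cluster[0]]) >= min_similarity:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical toy data: one model, two translated copies of it in the scene.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
instance_a = [(x + 10.0, y, z) for x, y, z in model]
instance_b = [(x, y + 50.0, z) for x, y, z in model]
correspondences = list(zip(model, instance_a)) + list(zip(model, instance_b))

matrix = distance_invariance_matrix(correspondences, tol=0.01)
clusters = cluster_by_row_similarity(matrix, min_similarity=0.9)
```

    On this toy input the matrix is block-diagonal, so the two instances separate cleanly; real correspondence sets contain outliers and partial overlaps, which is where the refined clustering and overlap-based screening described above would come in.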
