Multi-object tracking (MOT) is a core problem in computer vision: objects of specified categories are detected in every frame of an image sequence and linked into trajectories. MOT techniques are widely applied in areas such as airspace early warning, safety monitoring, autonomous driving, and video analysis. Recent MOT research focuses on the tracking-by-detection (TBD) paradigm, in which objects are first detected in each frame and tracking then reduces to a data association task across frames. This approach, which often uses appearance and motion embeddings for bipartite graph matching, is favored by researchers because it benefits directly from high-performance object detection models.
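The association step in tracking-by-detection can be illustrated with a minimal sketch. It assumes a cost that blends appearance (cosine distance between embeddings) with a motion-style overlap term (1 - IoU), and it solves the bipartite matching by brute force, which is only practical for a handful of objects. The function names, the weight `w_app`, and the gate `max_cost` are hypothetical choices for illustration, not values from this work; a Hungarian solver such as `scipy.optimize.linear_sum_assignment` would normally replace the brute-force search.

```python
from itertools import permutations
from math import sqrt


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    return 1.0 - dot / (nu * nv) if nu > 0 and nv > 0 else 1.0


def associate(tracks, detections, w_app=0.5, max_cost=0.7):
    """Match tracks to detections; each item is a (box, embedding) pair.

    Returns a sorted list of (track_index, detection_index) pairs.
    w_app and max_cost are illustrative assumptions.
    """
    if not tracks or not detections:
        return []
    cost = [[w_app * cosine_distance(t_emb, d_emb)
             + (1.0 - w_app) * (1.0 - iou(t_box, d_box))
             for d_box, d_emb in detections]
            for t_box, t_emb in tracks]
    n, m = len(tracks), len(detections)
    # Enumerate every one-to-one assignment of the smaller side to the larger.
    if n <= m:
        candidates = ([(i, p[i]) for i in range(n)]
                      for p in permutations(range(m), n))
    else:
        candidates = ([(p[j], j) for j in range(m)]
                      for p in permutations(range(n), m))
    best_pairs, best_total = [], float("inf")
    for pairs in candidates:
        total = sum(cost[i][j] for i, j in pairs)
        if total < best_total:
            best_pairs, best_total = pairs, total
    # Reject matched pairs whose individual cost exceeds the gate.
    return sorted((i, j) for i, j in best_pairs if cost[i][j] <= max_cost)
```

Tracks left unmatched by `associate` would typically be kept alive for a few frames, and unmatched detections would start new tracks; that bookkeeping is omitted here.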
A multi-object tracking method based on quasi-dense similarity learning (QDSL) is proposed to address the loss of accuracy caused by clutter, occlusions, and similar object appearances. An improved YOLOv8 framework is used for multi-object detection and produces bounding boxes for each frame. Object regions are then densely sampled across adjacent frames for contrastive learning, which supports local data association. Finally, similarity learning is integrated with YOLOv8 and combined with an appearance-free link model, which performs global association without depending on appearance features. This approach effectively balances tracking speed and accuracy.
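The contrastive step over densely sampled regions can be sketched as follows. The exact loss used in this work is not given here, so the sketch assumes the multi-positive contrastive form common in the quasi-dense tracking literature, L = log(1 + Σ₊ Σ₋ exp(v·k₋ − v·k₊)), where v is a region embedding in one frame and k₊, k₋ are embeddings of regions in the adjacent frame with the same or a different identity. All function names and the toy data layout are hypothetical.

```python
from math import exp, log


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def split_by_identity(anchor_id, candidates):
    """Split sampled regions from the adjacent frame into positives and negatives.

    candidates: list of (track_id, embedding) pairs (assumed layout).
    """
    positives = [emb for tid, emb in candidates if tid == anchor_id]
    negatives = [emb for tid, emb in candidates if tid != anchor_id]
    return positives, negatives


def multi_positive_contrastive_loss(anchor, positives, negatives):
    """log(1 + sum over (k+, k-) of exp(anchor.k- - anchor.k+)).

    The loss is small when every positive scores higher than every negative.
    """
    total = 0.0
    for k_pos in positives:
        for k_neg in negatives:
            total += exp(dot(anchor, k_neg) - dot(anchor, k_pos))
    return log(1.0 + total)
```

Because each anchor is compared against many sampled regions rather than a single ground-truth box, one object contributes many positive and negative pairs per frame pair, which is what makes the sampling "quasi-dense".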
The effectiveness of the proposed algorithm is evaluated through component ablation experiments on video sequences from the MOT17 dataset, with results summarized in Table 2. The experiments show a 0.7% increase in multiple object tracking accuracy (MOTA) and a 0.3% increase in identity F1 score (IDF1), demonstrating enhanced tracking performance, as shown in Fig. 8. Comparative experiments, detailed in Table 4, demonstrate the algorithm's advantages over three other tracking techniques. Furthermore, airplane tracking tests show that the algorithm is robust and handles images of varying sizes and viewing angles effectively, with results presented in Fig. 9.
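For reference, the two reported metrics follow standard definitions: MOTA = 1 − (FN + FP + IDSW) / GT, and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). The sketch below computes both from aggregate counts. The function names are hypothetical, and the counts in the usage note are made-up illustrations, not results from this work.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple object tracking accuracy from counts summed over all frames.

    num_gt is the total number of ground-truth objects over all frames.
    MOTA can be negative when errors outnumber ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1 score from identity-level true/false positives and false negatives."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom > 0 else 0.0
```

As an illustration with invented counts, `mota(10, 5, 5, 100)` gives 0.8 and `idf1(80, 10, 30)` gives 0.8. MOTA is dominated by detection errors, while IDF1 rewards keeping the same identity along a trajectory, which is why the two are usually reported together.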
The proposed multi-object tracking method combines an improved version of YOLOv8 with QDSL to address occlusion and similar object appearances. With QDSL, object features are learned more accurately, which improves both the stability and the reliability of local association. Trajectories are updated continuously using Gaussian-smoothed interpolation, supporting the model's applicability to complex multi-target tracking scenarios. Extensive comparative experiments and tracking tests demonstrate the model's superior adaptability to real-world situations.
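The idea of Gaussian-smoothed interpolation can be sketched in two steps: fill the frames where a target was missed, then smooth the resulting trajectory. The precise formulation used in this work is not stated here, so the sketch assumes linear gap filling followed by a Gaussian-kernel weighted average over frames; Gaussian process regression is another common choice. The function names and the bandwidth `sigma` are hypothetical.

```python
from math import exp


def fill_gaps(track):
    """Linearly interpolate missing frames.

    track: dict mapping frame index -> (x, y) box centre (assumed layout).
    Returns a dict covering every frame from the first to the last observation.
    """
    if not track:
        return {}
    frames = sorted(track)
    filled = {}
    for f0, f1 in zip(frames, frames[1:]):
        (x0, y0), (x1, y1) = track[f0], track[f1]
        for f in range(f0, f1):
            a = (f - f0) / (f1 - f0)
            filled[f] = (x0 + a * (x1 - x0), y0 + a * (y1 - y0))
    filled[frames[-1]] = track[frames[-1]]
    return filled


def gaussian_smooth(track, sigma=1.0):
    """Replace each position by a Gaussian-weighted average of all positions.

    Weights decay with the squared frame distance; sigma is in frames.
    """
    frames = sorted(track)
    smoothed = {}
    for f in frames:
        weights = [exp(-((f - g) ** 2) / (2.0 * sigma ** 2)) for g in frames]
        total = sum(weights)
        smoothed[f] = (
            sum(w * track[g][0] for w, g in zip(weights, frames)) / total,
            sum(w * track[g][1] for w, g in zip(weights, frames)) / total,
        )
    return smoothed
```

A typical use is `gaussian_smooth(fill_gaps(track))` on each finished trajectory. Note that a plain kernel average pulls the endpoints toward the interior of the track, so a real implementation may treat the first and last frames separately.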