Sparse multiple hypothesis matching and model lightweighting for infrared multi-object tracking

Changqi XU; Haoxian WANG; Jun WANG; Zhiquan ZHOU

doi:10.3788/IRLA20240373

Abstract

Objective: With the continued development of the marine economy, marine economic security has become crucial. As an important research direction in marine intelligent sensing, ship multi-object tracking must be both real-time and accurate. By quantitatively analyzing the matching process of the cost matrix, a sparse multi-hypothesis (SMH) matching algorithm is designed that combines the deep cascade matching (DCM) algorithm with multiple hypothesis tracking (MHT); it improves the matching accuracy of the cost matrices while sparsifying them and reducing the computation required for matching. To address parameter redundancy in the deep learning model, the YOLOv8s model is pruned with the layer-adaptive magnitude-based pruning (LAMP) algorithm, reducing the model's parameter count and floating-point operations (FLOPs) without reducing its accuracy.

Methods: First, the YOLOv8s model is pruned to obtain the YOLOv8s-prune model, which is used to detect objects in each frame of a given video sequence. Second, the detection results and predicted trajectory positions are grouped by their pseudo-depth information, which provides the basis for distinguishing standard matching from ambiguous matching. Next, the similarity between corresponding groups of detection results and predicted trajectory positions is computed using IoU to obtain the cost matrix group. The hypothesis trees and the standard cost matrix group are then constructed by analyzing the rows and columns of each cost matrix in the group. The hypothesis trees represent the hypotheses formed by the predicted trajectory positions and detection results involved in ambiguous matching, while the standard cost matrix group records the matching costs between predicted trajectory positions and detection results in standard matching.
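
The grouping and cost-matrix steps described so far can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes boxes in (x1, y1, x2, y2) pixel format, takes pseudo-depth to be the distance from a box's bottom edge to the bottom of the image, splits that range into equal-width levels, and uses 1 - IoU as the matching cost. The function names and the `num_levels` parameter are hypothetical.

```python
import numpy as np


def iou_matrix(tracks, dets):
    """Pairwise IoU between two sets of boxes given as (x1, y1, x2, y2)."""
    tracks = np.asarray(tracks, dtype=float).reshape(-1, 4)
    dets = np.asarray(dets, dtype=float).reshape(-1, 4)
    # Intersection corners via broadcasting: shape (num_tracks, num_dets, 2).
    top_left = np.maximum(tracks[:, None, :2], dets[None, :, :2])
    bottom_right = np.minimum(tracks[:, None, 2:], dets[None, :, 2:])
    wh = np.clip(bottom_right - top_left, 0.0, None)
    inter = wh[..., 0] * wh[..., 1]
    area_t = (tracks[:, 2] - tracks[:, 0]) * (tracks[:, 3] - tracks[:, 1])
    area_d = (dets[:, 2] - dets[:, 0]) * (dets[:, 3] - dets[:, 1])
    union = area_t[:, None] + area_d[None, :] - inter
    return np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)


def depth_level(boxes, image_height, num_levels):
    """Assign each box to a pseudo-depth level.

    Assumption: pseudo-depth = image_height - bottom edge of the box,
    binned into `num_levels` equal-width levels (0 = nearest).
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    pseudo_depth = image_height - boxes[:, 3]
    edges = np.linspace(0.0, image_height, num_levels + 1)
    # Using only the inner edges yields level indices 0 .. num_levels - 1.
    return np.clip(np.digitize(pseudo_depth, edges[1:-1]), 0, num_levels - 1)


def cost_matrix_group(tracks, dets, image_height, num_levels=3):
    """Return one (track_idx, det_idx, cost) triple per depth level."""
    tracks = np.asarray(tracks, dtype=float).reshape(-1, 4)
    dets = np.asarray(dets, dtype=float).reshape(-1, 4)
    t_levels = depth_level(tracks, image_height, num_levels)
    d_levels = depth_level(dets, image_height, num_levels)
    group = []
    for level in range(num_levels):
        t_idx = np.flatnonzero(t_levels == level)
        d_idx = np.flatnonzero(d_levels == level)
        # Cost is only computed between boxes in the same depth level.
        cost = 1.0 - iou_matrix(tracks[t_idx], dets[d_idx])
        group.append((t_idx, d_idx, cost))
    return group
```

In this sketch, pairs from different depth levels never enter any cost matrix, which is one simple way the overall matching problem becomes sparser than a single dense track-by-detection matrix.
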
Finally, the optimal tracking result is obtained by solving the hypothesis trees and the standard cost matrix group. Solving the hypothesis trees is transformed into a maximum weighted independent set (MWIS) problem by calculating trajectory scores, and this problem is solved with the Bron-Kerbosch algorithm; the standard cost matrix group is solved by the DCM algorithm from far to near based on the pseudo-depth information.

Results and Discussions: In comparisons with other trackers, SMH outperforms them on most metrics, which shows that SMH is more robust and effective (Tab.1). In the speed experiment between MHT and SMH, SMH is faster and has lower computational complexity (Tab.2). In the comparison between YOLOv8s and YOLOv8s-prune, the parameter count and FLOPs of YOLOv8s-prune decrease dramatically, while its accuracy does not decrease much and even improves slightly (Tab.3). In the ablation experiments, SMH outperforms the baseline, and YOLOv8s-prune does not have a large impact on the tracker's accuracy (Tab.4). In the tracking visualization comparing BoT-SORT and SMH, SMH keeps the original track number rather than assigning a new one when re-tracking a lost trajectory (Fig.10). In the feature-map response visualization, YOLOv8s-prune responds more weakly to small objects and, for large objects, more weakly to the object boundary (Fig.11). The channel visualization of YOLOv8s and YOLOv8s-prune shows that many channels in the large convolution kernels contribute little to the model's accuracy, whereas modules with fewer channels are preserved (Fig.12).

Conclusions: A lightweight infrared multi-object tracking algorithm based on sparse multi-hypothesis matching is proposed to address ambiguous matching and the slow inference speed of deep learning-based object detectors.
SMH combines the DCM algorithm with the MHT algorithm and outperforms existing multi-object tracking algorithms. Compared with YOLOv8s, the YOLOv8s-prune model obtained with the LAMP pruning algorithm has fewer parameters and floating-point operations. In future work, we plan to exploit the temporal information in video and combine it with the current spatial matching methods to improve multi-object tracking performance.
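
As a rough illustration of the MWIS step described in the abstract, the sketch below picks the best set of mutually compatible hypotheses. It relies on a standard equivalence: an independent set in a conflict graph is a clique in the complement graph, so Bron-Kerbosch can enumerate the maximal cliques of the complement and the heaviest one is kept. The hypothesis names, scores, and conflict pairs are hypothetical, and the sketch assumes all trajectory scores are positive (so the optimum is always a maximal set); how the paper computes scores and builds conflicts is not specified here.

```python
def bron_kerbosch(r, p, x, adj):
    """Yield every maximal clique of the graph `adj` (Bron-Kerbosch with pivoting)."""
    if not p and not x:
        yield r
        return
    # Pivot on the vertex with the most neighbours in p to prune branches.
    pivot = max(p | x, key=lambda v: len(adj[v] & p))
    for v in list(p - adj[pivot]):
        yield from bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj)
        p = p - {v}
        x = x | {v}


def max_weight_independent_set(weights, conflicts):
    """Return (best_set, total_weight) for positive node weights.

    weights:   dict mapping hypothesis id -> trajectory score (assumed > 0)
    conflicts: iterable of (a, b) pairs that cannot both be selected,
               e.g. two hypotheses that claim the same detection
    """
    nodes = set(weights)
    conflict_adj = {v: set() for v in nodes}
    for a, b in conflicts:
        conflict_adj[a].add(b)
        conflict_adj[b].add(a)
    # Independent sets of the conflict graph are cliques of its complement.
    compat_adj = {v: nodes - conflict_adj[v] - {v} for v in nodes}
    best = max(
        bron_kerbosch(set(), set(nodes), set(), compat_adj),
        key=lambda clique: sum(weights[v] for v in clique),
    )
    return best, sum(weights[v] for v in best)


# Hypothetical example: four hypotheses with three pairwise conflicts.
scores = {"h1": 3.0, "h2": 2.0, "h3": 2.5, "h4": 1.0}
conflict_pairs = [("h1", "h2"), ("h1", "h3"), ("h3", "h4")]
best_set, best_score = max_weight_independent_set(scores, conflict_pairs)
# best_set == {"h2", "h3"}, best_score == 4.5
```

Note that the highest-scoring single hypothesis (`h1`) is not selected in this example, because choosing it would exclude both `h2` and `h3`, whose combined score is larger. Full enumeration is exponential in the worst case, so a sketch like this is only practical when each hypothesis tree stays small.
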
