• Optics and Precision Engineering
  • Vol. 31, Issue 19, 2884 (2023)
Zifen HE, Lin XU, Yinhui ZHANG*, and Ying HUANG
Author Affiliations
  • Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500, China
    DOI: 10.37188/OPE.20233119.2884
    Zifen HE, Lin XU, Yinhui ZHANG, Ying HUANG. Mask generation dynamically regulates weakly supervised video instance segmentation[J]. Optics and Precision Engineering, 2023, 31(19): 2884

    Abstract

    Fully supervised video instance segmentation networks depend on accurate mask annotations, which are costly in labor and time, so intelligent machines cannot adapt quickly to new scenes. To address this, a weakly supervised video instance segmentation (WSVIS) network with dynamically regulated mask generation is proposed. First, to overcome the loss of instance activation features caused by the abrupt drop in channel dimension at the initial mask prediction layer, a multi-level feature fusion module predicts the initial instance features through a step-by-step feature reuse strategy and generates the initial mask by fusing relative position information. Second, a dynamic regulation mechanism establishes mask feature dependencies in the channel and spatial dimensions to strengthen the dynamic interaction between the initially predicted mask and instance-aware information. Finally, fine mask annotations are replaced by the binary color similarity of images and a bounding-box consistency loss, so that the video instance segmentation masks are supervised with bounding-box annotations only. Experimental results show that, on the BoxSet and YT-VIS datasets, the WSVIS network achieves segmentation accuracy and quality similar to those of the fully supervised network while supporting real-time inference, providing theoretical support and an algorithmic basis for intelligent machines to adapt quickly to new scenes and achieve real-time environmental perception and understanding.
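
    The abstract does not give the formulas for the bounding-box consistency loss or the color similarity term. As a rough illustration only, the sketch below assumes a formulation commonly used in box-supervised segmentation: the predicted mask is projected onto the x and y axes and compared with the projections of the box via a Dice loss, and color similarity between neighbouring pixels is computed as exp(-||c_i - c_j|| / theta). The function names, the Dice form, and the parameter `theta` are assumptions for this example, not the authors' implementation.

```python
import numpy as np


def box_projection_loss(mask_prob, box, eps=1e-6):
    """Dice loss between the axis projections of a predicted mask and its box.

    mask_prob: (H, W) array of foreground probabilities in [0, 1].
    box: (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive (assumed convention).
    """
    h, w = mask_prob.shape
    x0, y0, x1, y1 = box
    box_mask = np.zeros((h, w), dtype=float)
    box_mask[y0:y1, x0:x1] = 1.0

    def dice(p, t):
        inter = (p * t).sum()
        return 1.0 - 2.0 * inter / ((p * p).sum() + (t * t).sum() + eps)

    # Max over rows gives the projection onto the x axis, and vice versa.
    loss_x = dice(mask_prob.max(axis=0), box_mask.max(axis=0))
    loss_y = dice(mask_prob.max(axis=1), box_mask.max(axis=1))
    return loss_x + loss_y


def color_similarity(image, theta=2.0):
    """Similarity between each pixel and its right-hand neighbour.

    image: (H, W, C) array. Returns an (H, W - 1) array of values in (0, 1].
    theta is a hypothetical temperature chosen for this sketch.
    """
    img = image.astype(float)
    diff = img[:, 1:, :] - img[:, :-1, :]
    return np.exp(-np.linalg.norm(diff, axis=-1) / theta)
```

    Under these assumptions, a mask that fills its box exactly gives a projection loss close to zero, and an all-zero mask gives the maximum value of 2. Neighbouring pixels with identical colors get a similarity of 1, which is the cue such methods use to encourage both pixels to share a mask label.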