• Optics and Precision Engineering
  • Vol. 30, Issue 10, 1228 (2022)
Guanghui LIU1,*, Qinmeng WANG1, Xuanrun CHEN1,2, and Yuebo MENG1
Author Affiliations
  • 1School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 70055, China
  • 2Zhongke Xingtu Spatial Data Technology Co., Ltd., Xi'an 710199, China
    DOI: 10.37188/OPE.20223010.1228
    Guanghui LIU, Qinmeng WANG, Xuanrun CHEN, Yuebo MENG. A multivariate information aggregation method for crowd density estimation and counting[J]. Optics and Precision Engineering, 2022, 30(10): 1228

    Abstract

    Crowd density estimation determines the distribution and number of people in a crowded scene, which is vital to safety systems and traffic control. A multivariate information aggregation method is proposed to address three problems in crowd density estimation for high-density images: difficult feature extraction, difficult acquisition of spatial semantic information, and insufficient feature fusion. First, a multi-information extraction network is designed, in which VGG-19 serves as the backbone network to deepen feature extraction, and a multilayer semantic supervision strategy encodes low-level features to improve their semantic representation. Second, a multiscale contextual information aggregation network is designed by embedding spatial information into the high-level feature space, and two lightweight spatial pyramid structures with strided convolution are applied to reduce parameter redundancy during global multiscale context aggregation. Finally, strided convolution is performed at the end of the network to accelerate inference without affecting precision. Comparison experiments are conducted on the ShanghaiTech, UCF-QNRF, and NWPU datasets. On Part_A of the ShanghaiTech dataset, the method achieves a mean absolute error (MAE) of 59.4 and a mean squared error (MSE) of 96.2; on Part_B, the MAE and MSE are 7.7 and 11.9, respectively. On the ultradense multiview-scene UCF-QNRF dataset, the MAE and MSE are 89.3 and 164.5, respectively. On the high-density NWPU dataset, the MAE and MSE are 87.9 and 417.2, respectively. The proposed method outperforms the compared methods, as results from practical application also indicate.
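The two reported metrics can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: that a predicted count is obtained by summing the values of a density map, and that "MSE" follows the common crowd-counting convention of taking the square root of the mean squared count error. The function names `count_from_density`, `mae`, and `mse` are hypothetical and are not taken from the authors' implementation.

```python
import math


def count_from_density(density_map):
    """Estimate a crowd count as the sum of all density-map values (assumed convention)."""
    return sum(sum(row) for row in density_map)


def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth counts."""
    n = len(true_counts)
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / n


def mse(pred_counts, true_counts):
    """Square root of the mean squared count error (assumed crowd-counting convention)."""
    n = len(true_counts)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts)) / n)


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the paper.
    predicted = [10.0, 20.0]
    ground_truth = [12.0, 17.0]
    print(mae(predicted, ground_truth))
    print(mse(predicted, ground_truth))
```

Under this convention, MAE reflects average counting accuracy, while MSE penalizes large per-image errors more heavily, which is one plausible reading of why the MSE figures above are consistently larger than the MAE figures.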