A multivariate information aggregation method for crowd density estimation and counting

Guanghui LIU; Qinmeng WANG; Xuanrun CHEN; Yuebo MENG

doi:10.37188/OPE.20223010.1228

Abstract

Crowd density estimation determines the distribution and number of people in a crowded scene, which is vital to safety systems and traffic control. A multivariate information aggregation method is proposed herein to address three problems in the crowd density estimation of high-density images: difficult feature extraction, difficult acquisition of spatial semantic information, and insufficient feature fusion. First, a multi-information extraction network is designed, in which VGG-19 is used as the backbone network to deepen feature extraction, and a multilayer semantic supervision strategy is adopted to encode low-level features and improve their semantic representation. Second, a multiscale contextual information aggregation network is designed based on spatial information embedded into the high-level feature space, and two lightweight spatial pyramid structures with strided convolution are applied to reduce the redundancy of model parameters during global multiscale context information aggregation. Finally, strided convolution is performed at the end of the network to accelerate network operation without affecting precision. Comparison experiments are conducted on the ShanghaiTech, UCF-QNRF, and NWPU datasets. On the ShanghaiTech dataset, the mean absolute error (MAE) and mean squared error (MSE) are 59.4 and 96.2, respectively, on Part_A, and 7.7 and 11.9, respectively, on Part_B. On the ultradense multiview-scene UCF-QNRF dataset, the MAE and MSE are 89.3 and 164.5, respectively. On the high-density NWPU dataset, the MAE and MSE are 87.9 and 417.2, respectively. The proposed method outperforms the compared methods, as also indicated by results in actual application.
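The MAE and MSE figures quoted above are computed over per-image head counts. The following is a minimal sketch of that evaluation, not the authors' code. It assumes the usual crowd-counting conventions: the predicted count of an image is the sum of its estimated density map, and "MSE" denotes the root of the mean squared count error. The abstract does not state either convention explicitly, and the function names `count_from_density` and `mae_and_mse` are hypothetical.

```python
import math


def count_from_density(density_map):
    """Estimated head count of one image: the sum of its density-map values.

    Assumption: the density map is a 2-D list of non-negative floats.
    """
    return sum(sum(row) for row in density_map)


def mae_and_mse(predicted_counts, true_counts):
    """Return (MAE, MSE) over a set of test images.

    Assumption: "MSE" follows the common crowd-counting usage, i.e. the
    square root of the mean squared count error.
    """
    if len(predicted_counts) != len(true_counts) or not true_counts:
        raise ValueError("count lists must be non-empty and of equal length")
    errors = [p - t for p, t in zip(predicted_counts, true_counts)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, mse


if __name__ == "__main__":
    # Toy data for illustration only; these are not results from the paper.
    toy_density = [[0.5, 0.5], [1.0, 0.0]]
    print(count_from_density(toy_density))
    print(mae_and_mse([10, 20], [12, 17]))
```

Because the squared term weights large per-image errors heavily, a few badly miscounted images can push MSE far above MAE. That is one possible reading of the wide gap between the two values reported for NWPU.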
