Journal: IEEE Transactions on Neural Networks and Learning Systems
Learning Multi-Level Density Maps for Crowd Counting

Abstract

People in crowd scenes are often distributed unevenly. On the one hand, the apparent size of people varies widely because of camera perspective: people far from the camera look smaller and are likely to occlude each other, whereas people near the camera look larger and are relatively sparse. On the other hand, the number of people also varies greatly, both within a scene and across scenes. This article aims to develop a novel model that can accurately estimate the crowd count of a given scene with an imbalanced distribution of people. To this end, we propose an effective multi-level convolutional neural network (MLCNN) architecture that first adaptively learns multi-level density maps and then fuses them to predict the final output. The density map at each level focuses on people of certain sizes, so the fusion of the multi-level density maps can handle the large variation in people size. In addition, we introduce a new loss function, named balanced loss (BL), that imposes relatively balanced feedback during training and further improves the performance of the proposed network. Furthermore, we introduce a new data set of 1,111 images with a total of 49,061 head annotations. MLCNN is easy to train, requiring only one end-to-end training stage. Experimental results demonstrate that MLCNN achieves state-of-the-art performance. In particular, it reaches a mean absolute error (MAE) of 242.4 on the UCF_CC_50 data set, which is 37.2 lower than the second-best result.
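The quantities named in the abstract can be illustrated with a minimal sketch. It relies on the standard convention that a density map sums to the estimated head count, and it computes MAE over per-image counts. The fixed weighted sum in `fuse_levels` is an assumption made purely for illustration: the abstract says only that MLCNN learns to fuse its multi-level maps, not how. All function names here are hypothetical, and the balanced loss is not sketched because the abstract does not define it.

```python
from typing import List, Sequence

DensityMap = List[List[float]]


def crowd_count(density_map: DensityMap) -> float:
    """Estimated number of people: the sum of all density values."""
    return sum(sum(row) for row in density_map)


def fuse_levels(level_maps: Sequence[DensityMap],
                weights: Sequence[float]) -> DensityMap:
    """Per-pixel weighted sum of same-sized density maps.

    Assumption: fixed weights stand in for the learned fusion
    described in the abstract; this is not the paper's method.
    """
    rows, cols = len(level_maps[0]), len(level_maps[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(weights, level_maps))
         for c in range(cols)]
        for r in range(rows)
    ]


def mean_absolute_error(predicted: Sequence[float],
                        actual: Sequence[float]) -> float:
    """MAE between predicted and ground-truth counts, one per image."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy example: two 2x2 density maps, one per level.
fine_level = [[0.5, 0.5], [0.0, 1.0]]
coarse_level = [[1.0, 0.0], [0.0, 1.0]]
fused = fuse_levels([fine_level, coarse_level], weights=[0.5, 0.5])
estimated = crowd_count(fused)
```

In this toy case each level already sums to two people, so the equally weighted fusion preserves that count while redistributing density between pixels.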