Journal: Circuits and Systems for Video Technology, IEEE Transactions on

Crowd Counting via Weighted VLAD on a Dense Attribute Feature Map


Abstract

Crowd counting is an important computer vision task with many applications in video surveillance. Although regression-based frameworks have greatly improved crowd counting, improving the discriminative power of the image representation remains an open problem. Conventional holistic features used in crowd counting often fail to capture the semantic attributes and spatial cues of an image. In this paper, we propose integrating semantic information into the learning of locality-aware feature (LAF) sets for accurate crowd counting. First, a convolutional neural network maps the original pixel space onto a dense attribute feature map, in which each dimension of the pixelwise feature indicates the probabilistic strength of a certain semantic class. Then, LAF, built on the idea of spatial pyramids over neighboring patches, is proposed to exploit more spatial context and local information. Finally, the traditional vector of locally aggregated descriptors (VLAD) encoding method is extended to a more general form, weighted VLAD (W-VLAD), which takes diverse coefficient weights into account. Experimental results validate the effectiveness of the proposed method.
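To make the last step more concrete, the following is a minimal NumPy sketch of VLAD encoding with per-descriptor weights. It is an illustration under stated assumptions, not the paper's formulation: the abstract does not say how the coefficient weights are obtained, so `weights` here is simply a hypothetical array supplied by the caller, and the function name `weighted_vlad`, the hard nearest-center assignment, and the final L2 normalization are common VLAD conventions assumed for this example. With all weights equal to one, the sketch reduces to plain VLAD.

```python
import numpy as np


def weighted_vlad(descriptors, centers, weights=None):
    """Encode local descriptors as a weighted VLAD vector (illustrative sketch).

    descriptors: (N, D) array of local features.
    centers:     (K, D) codebook, e.g. obtained with k-means.
    weights:     (N,) hypothetical per-descriptor coefficients;
                 None means all ones, which gives plain VLAD.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    centers = np.asarray(centers, dtype=float)
    n, d = descriptors.shape
    k = centers.shape[0]
    if weights is None:
        weights = np.ones(n)
    weights = np.asarray(weights, dtype=float)

    # Hard-assign every descriptor to its nearest codebook center.
    sq_dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = sq_dists.argmin(axis=1)

    # Accumulate the weighted residuals (descriptor minus center) per center.
    residuals = descriptors - centers[nearest]
    encoding = np.zeros((k, d))
    np.add.at(encoding, nearest, weights[:, None] * residuals)

    # Flatten to a K*D vector and L2-normalize (assumed convention).
    encoding = encoding.ravel()
    norm = np.linalg.norm(encoding)
    return encoding / norm if norm > 0 else encoding


if __name__ == "__main__":
    centers = np.array([[0.0, 0.0], [10.0, 10.0]])
    descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]])
    print(weighted_vlad(descriptors, centers))                   # plain VLAD
    print(weighted_vlad(descriptors, centers, [2.0, 0.0, 1.0]))  # weighted
```

In this toy usage, a weight of zero removes the second descriptor's residual from the encoding entirely, which shows how the coefficients control each local feature's contribution to the final representation.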
