Venue: European Conference on Computer Vision (ECCV)

People Counting in Videos by Fusing Temporal Cues from Spatial Context-Aware Convolutional Neural Networks



Abstract

We present an efficient method for people counting in video sequences from fixed cameras, utilising the responses of spatially context-aware convolutional neural networks (CNNs) in the temporal domain. For stationary cameras the background remains fairly static, while foreground characteristics such as size and orientation may depend on image location; training a CNN on whole frames therefore improves the differentiation between background and foreground pixels. The resulting foreground density, representing the presence of people in the scene, can then be associated with people counts. Moreover, fusing the count estimates in the temporal domain further enhances the accuracy of the final count. Our methodology was tested on the publicly available Mall dataset and achieved a mean deviation error of 0.091.
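The abstract states that per-frame count estimates are fused in the temporal domain, but does not specify the fusion scheme. A minimal sketch, assuming a simple centred moving average as the fusion step and the common definition of mean deviation error (mean absolute error normalised by the ground-truth count); both choices are illustrative assumptions, not the paper's confirmed method:

```python
import numpy as np

def fuse_counts_temporal(frame_counts, window=5):
    """Fuse noisy per-frame count estimates over time.

    Assumption: a centred moving average stands in for the paper's
    unspecified temporal fusion of CNN count responses.
    """
    counts = np.asarray(frame_counts, dtype=float)
    kernel = np.ones(window) / window
    # mode="same" yields one fused estimate per frame; boundary frames
    # are zero-padded, so estimates near the sequence edges are attenuated.
    return np.convolve(counts, kernel, mode="same")

def mean_deviation_error(pred, truth):
    """Mean deviation error: mean of |pred - truth| / truth per frame
    (a common counting metric; assumed, not defined in the abstract)."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(pred - truth) / truth))
```

For a constant stream of counts, interior frames are unchanged by the averaging, while frames whose estimates jitter around the true value are pulled toward it.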


