Multiscale Multitask Deep NetVLAD for Crowd Counting

Zenglin Shi; Le Zhang; Yibo Sun; Yangdong Ye

Abstract

Deep convolutional neural networks (CNNs) are the de facto standard for computer vision tasks, owing to their success in visual recognition on still images. However, their adaptations to crowd counting have not clearly established superiority over shallow models. Because of their shallow structure, existing CNNs for crowd counting prove limited in challenging scenarios such as illumination changes, partial occlusions, diverse crowd distributions, and perspective distortions. In this paper, we introduce a dynamic augmentation technique to train a much deeper CNN for crowd counting. To reduce overfitting caused by the limited number of training samples, multitask learning is further employed to learn generalizable representations across similar domains. We also propose to aggregate multiscale convolutional features extracted from the entire image into a compact single-vector representation, amenable to efficient and accurate counting, by way of the "Vector of Locally Aggregated Descriptors" (VLAD). A "deeply supervised" strategy is employed to provide an additional supervision signal to the bottom layers for further performance improvement. Experimental results on three benchmark crowd datasets show that our method achieves better performance than existing methods. Our implementation will be released at https://github.com/shizenglin/Multitask-Multiscale-Deep-NetVLAD.
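
The aggregation step named in the abstract can be illustrated with a minimal NumPy sketch of soft-assignment VLAD pooling, in the spirit of NetVLAD. This is not the authors' implementation: the function name soft_vlad, the alpha sharpness parameter, the random cluster centres, and the choice to pool two scales by simply concatenating their descriptors are all assumptions made for illustration. In a trained network the centres and assignment weights would be learned.

```python
import numpy as np

def soft_vlad(features, centers, alpha=10.0):
    """Aggregate N local D-dim descriptors into one K*D vector.

    features: (N, D) local descriptors (e.g. conv activations at N positions).
    centers:  (K, D) cluster centres (hypothetical here; learned in a real model).
    alpha:    assumed sharpness of the soft assignment.
    """
    # Residual of every descriptor to every centre: (N, K, D).
    diff = features[:, None, :] - centers[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)                    # (N, K)
    # Softmax over centres, shifted for numerical stability.
    logits = -alpha * sq_dist
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)          # (N, K)
    # Weighted sum of residuals per centre: (K, D).
    vlad = np.einsum("nk,nkd->kd", weights, diff)
    # Normalise each centre's residual, then L2-normalise the flattened vector.
    norms = np.linalg.norm(vlad, axis=1, keepdims=True)
    vlad = vlad / np.maximum(norms, 1e-12)
    flat = vlad.reshape(-1)
    return flat / max(np.linalg.norm(flat), 1e-12)

# Toy "multiscale" input: descriptors from two feature maps of different
# spatial size but the same channel depth, pooled together.
rng = np.random.default_rng(0)
scale_a = rng.normal(size=(64, 8))   # e.g. an 8x8 map with 8 channels
scale_b = rng.normal(size=(16, 8))   # e.g. a 4x4 map with 8 channels
centers = rng.normal(size=(4, 8))    # K = 4 hypothetical centres
descriptor = soft_vlad(np.concatenate([scale_a, scale_b]), centers)
print(descriptor.shape)
```

Whatever the number of spatial positions, the output has a fixed length of K * D, which is what makes the representation usable as input to a count regressor for images of any size.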

Bibliographic details

Source
Industrial Informatics, IEEE Transactions on | 2018, Issue 11 | pp. 4953-4962 | 10 pages
Authors
Zenglin Shi; Le Zhang; Yibo Sun; Yangdong Ye
Author affiliations

School of Information Engineering, Zhengzhou University, Zhengzhou, China;

Advanced Digital Sciences Center, University of Illinois at Urbana-Champaign, Singapore;

School of Information Engineering, Zhengzhou University, Zhengzhou, China;

School of Information Engineering, Zhengzhou University, Zhengzhou, China;

Original format: PDF
Language: English
Keywords
Feature extraction; Task analysis; Visualization; Computer vision; Training; Robustness; Convolution
