Conference on Information Storage and Processing Systems

TCN UNITS, SOLUTION IN RECOGNITION OF HUMAN ACTIVITIES



Abstract

Infrastructure for recognizing human activities and faces is used wherever behavior analysis is required. Since a video-based approach is built on already existing infrastructure, comprising CCTV (closed-circuit television) cameras and a central computer, it can be deployed in any space where action monitoring is necessary. The main objective of this paper is to build a reliable, lightweight classifier for human faces and actions, able to classify a large number of actions while remaining light enough to run in real time. It processes at least 30 frames per second on off-the-shelf computer hardware connected to an ordinary CCTV installation. The temporal convolutional network (TCN) is a viable solution to the proposed problem: it classifies a large number of actions (60) in real time, using only RGB (red-green-blue) images of fairly low resolution. Deciding which class an action belongs to should not depend on the environment, background, person, viewing angle, or other scene-specific identifiers; the decision should be associated only with the person executing the action and that person's spatial-temporal context. As technology and processing power improve, the problem shifts slightly: when more processing power is added to the system, this model can increase the number of frames per second, the number of cameras in the infrastructure, or the quality of the images, most likely yielding higher prediction accuracy. The model can also be extended to a larger number of classes with minimal impact on performance. The proposed model has a tested accuracy of 82%, which can be attributed to the recurrent property of the network, and it performs close to the best existing solutions. The present TCN + 3D convolution model is built from the smaller TCN units; its architecture alternates a Simple Unit with a Complex Unit in order to maximize the diversity of features the model learns.
This paper illustrates a deep learning classifier based on TCNs for human action recognition. It is relatively lightweight compared to other methods and performs very well, competing with the best architectures. Ideally, it classifies an action irrespective of the person executing it or the environment in which it was executed; this is achieved, as far as possible, by training and testing the model on a diverse dataset, namely NTU RGB+D. After each simple and complex unit, an Average Pool 3D layer reduces at least one dimension by half.
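The architecture described above, alternating Simple and Complex 3D convolutional units with an Average Pool 3D layer after each pair that halves at least one dimension, ending in a 60-way classifier, can be sketched roughly as follows. This is an illustrative reconstruction in PyTorch, not the authors' code: the unit internals, channel counts, kernel sizes, and input resolution are all assumptions.

```python
import torch
import torch.nn as nn

class SimpleUnit(nn.Module):
    """One 3D convolution with batch norm and ReLU (assumed structure)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(c_in, c_out, kernel_size=3, padding=1),
            nn.BatchNorm3d(c_out),
            nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class ComplexUnit(nn.Module):
    """Two stacked convolutions; 'complex' is interpreted here as a deeper unit."""
    def __init__(self, c):
        super().__init__()
        self.block = nn.Sequential(
            SimpleUnit(c, c),
            SimpleUnit(c, c),
        )
    def forward(self, x):
        return self.block(x)

class TCN3DClassifier(nn.Module):
    """Alternates Simple and Complex units; each pair is followed by an
    AvgPool3d that halves at least one dimension, as the abstract describes."""
    def __init__(self, num_classes=60):
        super().__init__()
        self.features = nn.Sequential(
            SimpleUnit(3, 16),            # RGB input -> 16 feature maps
            ComplexUnit(16),
            nn.AvgPool3d((1, 2, 2)),      # halve the two spatial dimensions
            SimpleUnit(16, 32),
            ComplexUnit(32),
            nn.AvgPool3d((2, 2, 2)),      # halve time and space
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),      # global pooling over (T, H, W)
            nn.Flatten(),
            nn.Linear(32, num_classes),   # 60 action classes
        )
    def forward(self, x):                 # x: (N, 3, T, H, W)
        return self.head(self.features(x))

# A batch of clips of 8 low-resolution RGB frames -> 60 class scores each.
model = TCN3DClassifier()
logits = model(torch.randn(2, 3, 8, 32, 32))
print(logits.shape)  # torch.Size([2, 60])
```

Halving only the spatial dimensions in the first pooling stage while preserving the temporal axis is one plausible reading of "reduces at least one dimension by half"; the paper may divide the dimensions differently.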
