IEEE Transactions on Image Processing

View-Invariant Deep Architecture for Human Action Recognition Using Two-Stream Motion and Shape Temporal Dynamics

Abstract

Human action recognition from unknown views is a challenging task. We propose a deep view-invariant human action recognition framework, which is a novel integration of two important action cues: motion and shape temporal dynamics (STD). The motion stream encapsulates the motion content of an action as RGB Dynamic Images (RGB-DIs), which are generated by Approximate Rank Pooling (ARP) and processed with a fine-tuned InceptionV3 model. The STD stream learns long-term view-invariant shape dynamics of the action using a sequence of LSTM and Bi-LSTM learning models. A Human Pose Model (HPM) generates view-invariant features from structural similarity index (SSIM)-based key depth human pose frames. The final prediction of the action is made by three types of late fusion, i.e. maximum (max), average (avg) and multiply (mul), applied to the individual stream scores. To validate the performance of the proposed framework, experiments are performed under both cross-subject and cross-view validation schemes on three publicly available benchmarks: the NUCLA multi-view dataset, the UWA3D-II Activity dataset and the NTU RGB-D Activity dataset. Our algorithm significantly outperforms existing state-of-the-art methods, measured in terms of recognition accuracy, receiver operating characteristic (ROC) curves and area under the curve (AUC).
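To make the motion-stream input concrete, the following is a minimal sketch of Approximate Rank Pooling using the closed-form frame weights of Bilen et al.'s dynamic-image formulation; it assumes a clip given as an array of RGB frames and rescales the result to an 8-bit image, which may differ from the paper's exact pre-processing.

```python
import numpy as np

def approximate_rank_pooling(frames):
    """Collapse a clip of RGB frames, shape (T, H, W, 3), into a single
    RGB Dynamic Image via approximate rank pooling: each frame is weighted
    by a closed-form coefficient and the weighted frames are summed."""
    frames = np.asarray(frames, dtype=np.float64)
    T = frames.shape[0]
    # Harmonic numbers H_0 .. H_T, with H_0 = 0 and H_t = sum_{i=1}^{t} 1/i.
    H = np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, T + 1))))
    # ARP weight for the t-th frame (1-indexed):
    # alpha_t = 2(T - t + 1) - (T + 1)(H_T - H_{t-1})
    t = np.arange(1, T + 1)
    alpha = 2.0 * (T - t + 1) - (T + 1) * (H[T] - H[t - 1])
    di = np.tensordot(alpha, frames, axes=1)  # (H, W, 3) weighted sum of frames
    # Rescale to [0, 255] so the dynamic image can feed a CNN such as InceptionV3.
    di = 255.0 * (di - di.min()) / (di.max() - di.min() + 1e-8)
    return di.astype(np.uint8)
```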
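A similarly small sketch of the three late-fusion rules applied to the two streams' class scores is given below; the function and argument names are illustrative assumptions, taking each stream's output to be a per-class softmax vector.

```python
import numpy as np

def late_fusion(motion_scores, std_scores, mode="mul"):
    """Fuse per-class scores from the motion stream and the STD stream
    with one of three rules (max, avg, mul) and return the predicted
    class index together with the fused score vector."""
    s1 = np.asarray(motion_scores, dtype=np.float64)
    s2 = np.asarray(std_scores, dtype=np.float64)
    if mode == "max":
        fused = np.maximum(s1, s2)
    elif mode == "avg":
        fused = (s1 + s2) / 2.0
    elif mode == "mul":
        fused = s1 * s2
    else:
        raise ValueError(f"unknown fusion mode: {mode}")
    return int(np.argmax(fused)), fused
```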
