Pattern Recognition Letters

T-VLAD: Temporal vector of locally aggregated descriptor for multiview human action recognition



Abstract

Robust view-invariant human action recognition (HAR) requires an effective representation of an action's temporal structure in multi-view videos. This study explores a view-invariant action representation based on convolutional features. Representing actions over long video segments is computationally expensive, whereas features from short video segments limit temporal coverage to a local window. Previous methods rely on complex multi-stream deep convolutional feature maps extracted over short segments. To address this issue, a novel framework is proposed based on a temporal vector of locally aggregated descriptors (T-VLAD). T-VLAD encodes the long-term temporal structure of a video using single-stream convolutional features over short segments. The size of a standard VLAD vector is a multiple of its feature codebook size (256 is normally recommended). VLAD is modified to incorporate the time-order information of segments, where the T-VLAD vector size is a multiple of its smaller time-order codebook size. Previous methods have not been extensively validated under view variation. Results are validated in a challenging setup in which one view is used for testing and the remaining views are used for training. State-of-the-art results are obtained on three multi-view datasets with fixed cameras: IXMAS, MuHAVi and MCAD. The proposed T-VLAD encoding also works equally well on a dynamic-background dataset, UCF101. (c) 2021 Elsevier B.V. All rights reserved.
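
To make the remark about vector size concrete, here is a minimal sketch of standard VLAD encoding, in which the output length equals the codebook size times the descriptor dimension. It illustrates plain VLAD only; it is not the T-VLAD time-order encoding, whose details the abstract does not give. The function name `vlad_encode`, the toy sizes, and the random data are assumptions made for illustration.

```python
import numpy as np


def vlad_encode(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Standard VLAD sketch: sum residuals to the nearest codeword, then L2-normalize.

    descriptors: (N, D) local features (hypothetical stand-ins for conv features).
    codebook:    (K, D) codewords, e.g. k-means centers learned beforehand.
    Returns a vector of length K * D.
    """
    K, D = codebook.shape
    # Squared Euclidean distance from every descriptor to every codeword: (N, K).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)

    vlad = np.zeros((K, D), dtype=np.float64)
    for k in range(K):
        members = descriptors[nearest == k]
        if len(members):
            # Accumulate residuals (descriptor minus its assigned codeword).
            vlad[k] = (members - codebook[k]).sum(axis=0)

    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


# Toy example (assumed sizes): K = 4 codewords, D = 8 dimensions.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(4, 8))
descriptors = rng.normal(size=(50, 8))
v = vlad_encode(descriptors, codebook)
print(v.shape)  # (32,), i.e. K * D
```

With a codebook of 256 codewords, the same function would return a vector 256 times the descriptor dimension, which is the growth that a smaller codebook is meant to avoid.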
