Journal: Neurocomputing

Human action recognition based on spatio-temporal three-dimensional scattering transform descriptor and an improved VLAD feature encoding algorithm

Abstract

Local spatio-temporal descriptors and feature encoding are two crucial steps in human action recognition based on spatio-temporal interest points (STIP). Since local descriptors for STIP essentially capture texture-based motion information, the key to local feature description is to extract invariant, robust and discriminative local texture features and motion information within the reference spatio-temporal volume. The scattering transform is an image transform based on directional wavelet transforms and scale convolution; it provides local translation invariance, rotation invariance and stability to elastic deformation for local texture features. This paper proposes a novel local descriptor for STIP based on a spatio-temporal three-dimensional scattering transform, which extends the original scattering transform to three-dimensional spatio-temporal space. Compared with traditional descriptors such as HOG and HOF, the proposed scattering transform coefficients based histogram of oriented gradients (STC-HOG) descriptor captures more robust and discriminative motion information of local texture for STIP. A feature encoding algorithm is indispensable for incorporating the local descriptors into an action video representation. To address the problem that the vector of locally aggregated descriptors (VLAD) loses the location information of the feature distribution during encoding, a histogram of distribution vector of locally aggregated descriptors (HOD-VLAD) based on a Gaussian kernel is proposed. The proposed algorithm is validated for human action recognition on multiple publicly available datasets, including KTH, UCF Sports and HMDB51. The experimental results indicate that the proposed descriptor and encoding method improve both the efficiency and the accuracy of human action recognition. (C) 2018 Elsevier B.V. All rights reserved.
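
The abstract does not give the details of HOD-VLAD, so the following is only a minimal sketch of standard VLAD encoding, included to illustrate the limitation the abstract refers to. The function name `vlad_encode`, the toy codebook and the sample descriptors are assumptions made for illustration and do not come from the paper. Each descriptor is hard-assigned to its nearest codeword and only the summed residual per codeword is kept, so how the descriptors are distributed around a codeword is not preserved.

```python
import math

def vlad_encode(descriptors, centers):
    """Standard VLAD: sum residuals to the nearest codeword, then L2-normalize."""
    k, d = len(centers), len(centers[0])
    residual_sums = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Hard-assign the descriptor to its nearest codeword (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])),
        )
        # Accumulate the residual between the descriptor and that codeword.
        for j in range(d):
            residual_sums[nearest][j] += x[j] - centers[nearest][j]
    # Concatenate the per-codeword sums into one vector and L2-normalize it.
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy 2-D codebook and descriptors (illustrative values only).
centers = [[0.0, 0.0], [10.0, 10.0]]
symmetric = [[1.0, 0.0], [-1.0, 0.0]]  # residuals around codeword 0 cancel out
print(vlad_encode(symmetric, centers))  # prints [0.0, 0.0, 0.0, 0.0]
```

In this toy case two descriptors on opposite sides of a codeword produce an all-zero encoding, which is the kind of lost distribution information that a distribution-aware encoding would aim to retain.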
