Published in: Multimedia Tools and Applications

Sparse coding-based space-time video representation for action recognition


Abstract

Methods based on feature descriptors around local interest points are now widely used in action recognition. Feature points are detected using a number of measures, such as saliency, periodicity, and motion activity. Each of these measures is usually intensity-based and provides a trade-off between density and informativeness. In this paper, we address the problem of action recognition by representing image sequences as a sparse collection of patch-level space-time events that are salient in both the space and time domains. Our method uses a multi-scale volumetric representation of video and adaptively selects an optimal space-time scale at which the saliency of a patch is most significant. The input image sequences are first partitioned into non-overlapping patches. Each patch is then represented by a vector of coefficients that linearly reconstructs the patch from a learned dictionary of basis patches. The space-time saliency of a patch is measured by Shannon's self-information entropy, where a patch's saliency is determined by the information variation in the contents of its spatiotemporal neighborhood. Experimental results on three benchmark datasets demonstrate the effectiveness of the proposed method.
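The pipeline outlined in the abstract (partition into non-overlapping patches, code each patch against a dictionary, score saliency by self-information) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names `partition`, `nearest_atom`, and `self_information` and the toy dictionary are hypothetical. Several simplifications are assumed: each patch is assigned to its single nearest dictionary atom (a crude 1-sparse stand-in for a full sparse code), the dictionary is fixed by hand rather than learned, the probability of a code is estimated from its frequency over the whole clip rather than over a local spatiotemporal neighborhood, and only one patch scale is used.

```python
import math
from collections import Counter

def partition(video, pt, ph, pw):
    """Split video[t][y][x] into non-overlapping pt x ph x pw patches,
    each flattened into a list of pixel values."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    patches = []
    for t0 in range(0, T - pt + 1, pt):
        for y0 in range(0, H - ph + 1, ph):
            for x0 in range(0, W - pw + 1, pw):
                patches.append([video[t][y][x]
                                for t in range(t0, t0 + pt)
                                for y in range(y0, y0 + ph)
                                for x in range(x0, x0 + pw)])
    return patches

def nearest_atom(patch, dictionary):
    """Index of the dictionary atom closest to the patch in squared
    Euclidean distance (assumed 1-sparse stand-in for sparse coding)."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(range(len(dictionary)),
               key=lambda k: sq_dist(patch, dictionary[k]))

def self_information(codes):
    """Self-information I(c) = -log2 p(c), with p(c) estimated from the
    relative frequency of each code in the given list."""
    counts = Counter(codes)
    n = len(codes)
    return [-math.log2(counts[c] / n) for c in codes]

# Toy clip: 2 frames of 4x4 pixels, dark except one bright 2x2 region.
video = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
for t in range(2):
    for y in (2, 3):
        for x in (2, 3):
            video[t][y][x] = 1.0

# Hypothetical two-atom dictionary: an all-dark and an all-bright patch.
dictionary = [[0.0] * 8, [1.0] * 8]

patches = partition(video, 2, 2, 2)
codes = [nearest_atom(p, dictionary) for p in patches]
saliency = self_information(codes)
print(codes)
print(saliency)
```

In this toy clip the single bright patch uses a code that is rare among the four patches, so it receives the highest self-information score; the three dark patches share a common code and score low. A fuller version would replace `nearest_atom` with a proper sparse solver over a learned dictionary and repeat the scoring across several space-time scales.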
