IEEE Transactions on Image Processing

Modeling Geometric-Temporal Context With Directional Pyramid Co-Occurrence for Action Recognition



Abstract

In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in video sequences. Compared with previously proposed covariance descriptors, ours can be measured and clustered in Euclidean space. Second, to capture geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves state-of-the-art performance and improves on the recognition performance of bag-of-visual-words (BOVW) models by a large margin on six public data sets. For example, on the KTH data set it achieves 98.78% accuracy, while the BOVW approach achieves only 88.06%. On both the Weizmann and UCF CIL data sets, it achieves the highest possible accuracy of 100%.
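The key property of the log-Euclidean construction is that the matrix logarithm flattens the manifold of symmetric positive-definite covariance matrices into a vector space, so Euclidean distances and ordinary k-means clustering apply directly. The abstract does not specify the feature channels or implementation details, so the following is a minimal sketch of that mapping, with the feature layout and the ridge term `eps` as illustrative assumptions:

```python
import numpy as np

def covariance_descriptor(features, eps=1e-6):
    """Covariance of the low-level feature vectors sampled inside a
    spatio-temporal cuboid; `features` is an (n_points, d) array.
    A small ridge keeps the matrix symmetric positive definite."""
    return np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])

def log_euclidean_vector(cov):
    """Map an SPD matrix to a Euclidean vector via the matrix logarithm."""
    vals, vecs = np.linalg.eigh(cov)           # SPD: real eigendecomposition
    log_cov = (vecs * np.log(vals)) @ vecs.T   # V diag(log(lambda)) V^T
    iu = np.triu_indices(log_cov.shape[0])
    # Off-diagonal entries appear twice in the Frobenius norm; weighting them
    # by sqrt(2) makes Euclidean distances between the vectors equal the
    # log-Euclidean Riemannian distances between the matrices.
    w = np.where(iu[0] == iu[1], 1.0, np.sqrt(2.0))
    return w * log_cov[iu]
```

Because distances between the returned vectors equal log-Euclidean distances between the original matrices, vector quantization of the descriptors reduces to standard k-means in Euclidean space.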
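The abstract likewise does not give the exact DPCM construction (pyramid levels, temporal handling, distance bins), so the sketch below only illustrates the core idea of directionally binned co-occurrence counts over vector-quantized local features; the radius threshold, the eight direction bins, and the histogram-intersection similarity are hypothetical stand-ins, not the paper's actual pyramid matching kernel:

```python
import numpy as np
from itertools import combinations

def directional_cooccurrence(positions, words, n_words, radius, n_dirs=8):
    """Count co-occurring visual-word pairs, binned by the direction of the
    offset between the two features.  `positions` is an (n, 2) array of
    (x, y) coordinates and `words` the matching visual-word indices.
    Returns an (n_dirs, n_words, n_words) array of pair counts."""
    M = np.zeros((n_dirs, n_words, n_words))
    for i, j in combinations(range(len(words)), 2):
        dx, dy = positions[j] - positions[i]
        if np.hypot(dx, dy) > radius:
            continue                            # only nearby pairs co-occur
        angle = np.arctan2(dy, dx) % (2 * np.pi)
        d = int(angle // (2 * np.pi / n_dirs)) % n_dirs
        M[d, words[i], words[j]] += 1
    return M

def intersection_similarity(M1, M2):
    """Histogram-intersection similarity between two normalized
    co-occurrence tensors; a simple stand-in for a matching kernel."""
    M1 = M1 / max(M1.sum(), 1.0)
    M2 = M2 / max(M2.sum(), 1.0)
    return np.minimum(M1, M2).sum()
```

Binning pairs by offset direction is what lets such statistics encode positional relationships among concurrent features rather than mere co-occurrence frequencies, which is the contextual information a plain BOVW histogram discards.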
