Conference: Asian Conference on Computer Vision

A Comparative Study of Encoding, Pooling and Normalization Methods for Action Recognition

Abstract

Bag of visual words (BoVW) models have been widely and successfully used in video-based action recognition. One key step in constructing a BoVW representation is encoding local features with a codebook. Recently, a number of new encoding methods have been developed to improve the performance of BoVW-based object recognition and scene classification, such as soft assignment encoding [1], sparse encoding [2], locality-constrained linear encoding [3] and Fisher kernel encoding [4]. However, their effect on action recognition remains unknown. The main objective of this paper is to evaluate and compare these new encoding methods in the context of video-based action recognition. We also analyze and evaluate the combination of encoding methods with different pooling and normalization strategies. We carry out experiments on the KTH dataset [5] and the HMDB51 dataset [6]. The results show that the new encoding methods significantly improve recognition accuracy compared with classical vector quantization (VQ). Among them, Fisher kernel encoding and sparse encoding perform best. By properly choosing the pooling and normalization methods, we achieve state-of-the-art performance on HMDB51.
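To make the encode, pool, normalize pipeline concrete, here is a minimal NumPy sketch that contrasts classical VQ with soft assignment encoding, followed by sum or max pooling and L2 or power normalization. It is an illustration under stated assumptions, not the paper's implementation: the function names, the smoothing parameter `beta`, the power exponent `alpha`, and the random toy data are all hypothetical, and sparse, locality-constrained linear, and Fisher kernel encoding are omitted for brevity.

```python
import numpy as np


def squared_distances(features, codebook):
    """Pairwise squared Euclidean distances, shape (n_features, n_codewords)."""
    diff = features[:, None, :] - codebook[None, :, :]
    return np.sum(diff ** 2, axis=2)


def vq_encode(features, codebook):
    """Classical VQ: each feature votes for its single nearest codeword."""
    d2 = squared_distances(features, codebook)
    codes = np.zeros_like(d2)
    codes[np.arange(len(features)), np.argmin(d2, axis=1)] = 1.0
    return codes


def soft_assign_encode(features, codebook, beta=1.0):
    """Soft assignment: weights proportional to exp(-beta * squared distance).

    `beta` is an assumed smoothing parameter chosen for illustration.
    """
    logits = -beta * squared_distances(features, codebook)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def pool(codes, method="sum"):
    """Aggregate per-feature codes into one video-level vector."""
    if method == "sum":
        return codes.sum(axis=0)
    if method == "max":
        return codes.max(axis=0)
    raise ValueError(f"unknown pooling method: {method}")


def normalize(vec, method="l2", alpha=0.5):
    """L2 normalization, optionally preceded by signed power scaling.

    `alpha` is an assumed exponent; "power" here means power scaling
    followed by L2 normalization.
    """
    if method == "power":
        vec = np.sign(vec) * np.abs(vec) ** alpha
    elif method != "l2":
        raise ValueError(f"unknown normalization method: {method}")
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


# Toy usage: random stand-ins for local video descriptors and a codebook.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 16))
codebook = rng.normal(size=(8, 16))

vq_repr = normalize(pool(vq_encode(descriptors, codebook), "sum"), "l2")
soft_repr = normalize(pool(soft_assign_encode(descriptors, codebook), "max"), "power")
```

Each resulting vector has one entry per codeword and would typically be fed to a classifier; which encoding, pooling, and normalization combination works best is the empirical question the paper studies.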
