Published in: IEEE Transactions on Image Processing

Spatiotemporal Localization and Categorization of Human Actions in Unsegmented Image Sequences


Abstract

In this paper, we address the problem of localizing and recognizing human activities in unsegmented image sequences. The main contribution of the proposed method is an implicit representation of the spatiotemporal shape of the activity, which relies on the spatiotemporal localization of characteristic ensembles of feature descriptors. Evidence for the spatiotemporal localization of the activity is accumulated in a probabilistic spatiotemporal voting scheme. The local nature of the voting framework allows us to handle multiple activities taking place in the same scene, as well as activities in the presence of clutter and occlusion. We use boosting to select characteristic ensembles per class. This leads to a set of class-specific codebooks in which each codeword is an ensemble of features. During training, we store the spatial positions of the codeword ensembles with respect to a set of reference points, as well as their temporal positions with respect to the start and end of the action instance. During testing, each activated codeword ensemble casts votes concerning the spatiotemporal position and extent of the action, using the information stored during training. Mean shift mode estimation in the voting space provides the most probable hypotheses concerning the localization of the subjects at each frame, as well as the extent of the activities depicted in the image sequences. We present classification and localization results for a number of publicly available datasets and for a number of sequences with a significant amount of clutter and occlusion.
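The voting and mode-estimation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the (x, y, t) voting space, the Gaussian kernel, the bandwidth, the merge radius, and the names `mean_shift_mode` and `find_modes` are all assumptions chosen for illustration. Each vote is a position with a weight, and mean shift climbs from every vote to a local maximum of the weighted vote density.

```python
import math

def kernel_sum(votes, point, bandwidth):
    """Weighted Gaussian kernel density (unnormalized) of the votes at `point`."""
    return sum(
        w * math.exp(-sum((p - q) ** 2 for p, q in zip(pos, point))
                     / (2.0 * bandwidth ** 2))
        for pos, w in votes
    )

def mean_shift_mode(votes, start, bandwidth=1.0, tol=1e-6, max_iter=200):
    """Climb from `start` to a nearby local mode of the vote density."""
    point = tuple(start)
    for _ in range(max_iter):
        total = 0.0
        acc = [0.0] * len(point)
        for pos, w in votes:
            d2 = sum((p - q) ** 2 for p, q in zip(pos, point))
            k = w * math.exp(-d2 / (2.0 * bandwidth ** 2))
            total += k
            for i, p in enumerate(pos):
                acc[i] += k * p
        new_point = tuple(a / total for a in acc)  # kernel-weighted mean
        shift = math.dist(new_point, point)
        point = new_point
        if shift < tol:
            break
    return point

def find_modes(votes, bandwidth=1.0, merge_radius=0.5):
    """Run mean shift from every vote and merge modes that coincide."""
    modes = []
    for pos, _ in votes:
        mode = mean_shift_mode(votes, pos, bandwidth)
        if all(math.dist(mode, other) > merge_radius for other, _ in modes):
            modes.append((mode, kernel_sum(votes, mode, bandwidth)))
    # Strongest hypothesis first.
    return sorted(modes, key=lambda m: m[1], reverse=True)

# Hypothetical votes: ((x, y, t), weight). Two actions in the same scene.
votes = [
    ((10.0, 20.0, 5.0), 1.0), ((10.5, 20.0, 5.0), 1.0),
    ((9.5, 20.0, 5.0), 1.0), ((10.0, 20.5, 5.5), 1.0),
    ((40.0, 25.0, 30.0), 1.0), ((40.4, 25.0, 30.0), 1.0),
]
modes = find_modes(votes)
```

Because every vote only influences its neighborhood in the voting space, the two clusters yield two separate hypotheses, which mirrors the abstract's point that a local voting scheme can handle several activities in one scene.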
