Conference on Computer and Robot Vision

A Multi-Scale Hierarchical Codebook Method for Human Action Recognition in Videos Using a Single Example



Abstract

This paper presents a novel action matching method based on a hierarchical codebook of local spatio-temporal video volumes (STVs). Given a single example of an activity as a query video, the proposed method finds videos similar to the query in a video dataset. It is based on the bag of video words (BOV) representation and does not require prior knowledge about actions, background subtraction, motion estimation, or tracking. It is also robust to spatial and temporal scale changes, as well as some deformations. The hierarchical algorithm yields a compact subset of salient code words of STVs for the query video, and the likelihood of similarity between the query video and all STVs in the target video is then measured using a probabilistic inference mechanism. This hierarchy is achieved by initially constructing a codebook of STVs while accounting for the uncertainty in codebook construction, which is ignored in current versions of the BOV approach. At the second level of the hierarchy, a large contextual region containing many STVs (an ensemble of STVs) is considered in order to construct a probabilistic model of STVs and their spatio-temporal compositions. At the third level of the hierarchy, a codebook is formed for the ensembles of STVs based on their contextual similarities. The latter are the proposed labels (code words) for the actions exhibited in the video. Finally, at the highest level of the hierarchy, the salient labels for the actions are selected by analyzing the high-level code words assigned to each image pixel as a function of time. The algorithm was applied to three video datasets of varying complexity (KTH, Weizmann, and MSR II), and the results were superior to other approaches, especially in the cases of a single training example and cross-dataset action recognition.
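The first level of the hierarchy described above (a codebook of STV descriptors with the assignment uncertainty retained) can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptors are toy 2-D points, the clustering is a plain k-means, and the soft-assignment weights use an assumed Gaussian kernel with a hypothetical `sigma` parameter — it only shows how probabilistic (soft) assignment differs from the hard assignment of standard BOV.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    # Simple k-means to build a codebook (code-word centers) from descriptors.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def soft_assign(x, centers, sigma=1.0):
    # Probabilistic assignment of one descriptor to all code words,
    # retaining the codebook uncertainty that hard assignment discards.
    d2 = ((centers - x) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return w / w.sum()

# Toy "STV descriptors": two well-separated 2-D clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
               rng.normal(3.0, 0.1, (50, 2))])
codebook = kmeans(X, k=2)

# A descriptor midway between the clusters gets split membership
# (~0.5 / 0.5) rather than a single hard label.
p = soft_assign(np.array([1.5, 1.5]), codebook)
```

With hard assignment the ambiguous descriptor would be forced onto one code word; the soft weights instead propagate its ambiguity into the higher levels of the hierarchy.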
