
Language Motivated Approaches for Human Action Recognition and Spotting.



Abstract

Action recognition has become an important area of computer vision research. Given a sequence of images in which people perform different actions over time, can a system be designed to automatically recognize which action is being performed, and in which specific frames it occurs? To date, much of the computer vision community has approached this problem from a single-action perspective, reducing it to the classification of an image sequence containing one action. Given a sequence, it is thus assumed in advance that only one major action, drawn from a known set of classes, occurs in it. This dissertation targets not only the recognition of actions but also the problem of spotting (localizing) actions in video data.

Our proposed approach shares sub-actions across action classes to uncover the underlying motion patterns within actions, and uses these patterns for recognition and spotting. First, as a proof of concept, we build a framework that models an action as a predefined sequence of sub-actions, and we show experimentally that this framework is useful for both action recognition and spotting. Next, building on this approach, we learn the sub-actions automatically rather than defining them manually. To gain statistical insight into the underlying motion patterns of actions, we have developed a dynamic, hierarchical Bayesian model that connects low-level visual features in videos with poses, motion patterns, and classes of activities. The process is somewhat analogous to detecting topics or categories in documents based on their word content, except that our documents are dynamic. The proposed generative model harnesses both the temporal ordering power of dynamic Bayesian networks such as hidden Markov models (HMMs) and the automatic clustering power of hierarchical Bayesian models such as latent Dirichlet allocation (LDA).

We have also introduced a probabilistic framework for detecting and localizing pre-specified actions (or gestures) in a video sequence, analogous to the use of filler models for keyword detection in speech processing. We demonstrate the robustness of our classification model and our spotting framework by recognizing actions in unconstrained real-life video sequences and by spotting gestures via a one-shot-learning approach. Thanks to advances in human action recognition, several publicly available datasets now offer large numbers of actions collected from various media sources, reflecting real-world scenarios. We have evaluated the proposed methods on these datasets and outperformed several techniques described in the literature.

Overall, we have proposed a new, robust framework for modeling actions that offers insight into their building blocks rather than merely performing recognition.
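To make the document-topic analogy concrete, the following is a minimal sketch of the static version of that idea: per-frame motion descriptors are quantized into a codebook of "visual words", each clip becomes a bag of words, and LDA clusters clips into latent motion-pattern topics. The codebook size, topic count, and synthetic descriptors are illustrative assumptions only; the dissertation's actual model additionally imposes HMM-style temporal ordering, which this sketch omits.

```python
# Illustrative sketch of the document-topic analogy for action recognition.
# All data and parameters here are hypothetical, not the dissertation's model.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Toy per-frame motion descriptors for six clips (stand-ins for flow features).
clips = [rng.normal(loc=c % 3, scale=0.5, size=(40, 8)) for c in range(6)]

# 1. Build a codebook of "visual words" by clustering all frame descriptors.
codebook = KMeans(n_clusters=20, n_init=10, random_state=0)
codebook.fit(np.vstack(clips))

# 2. Represent each clip as a bag-of-words histogram over the codebook.
def clip_histogram(frames):
    words = codebook.predict(frames)
    return np.bincount(words, minlength=20)

X = np.array([clip_histogram(c) for c in clips])

# 3. LDA discovers latent "motion pattern" topics shared across clips; a
#    clip's topic mixture can then feed a downstream action classifier.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
theta = lda.fit_transform(X)  # clip-by-topic mixture weights
print(np.round(theta, 2))
```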
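Similarly, the filler-model spotting idea can be sketched as a likelihood-ratio test: each sliding window of the feature stream is scored under a pre-specified gesture model against a generic filler (background) model, and windows whose log-likelihood ratio clears a threshold are reported as detections. The Gaussian emission models, window length, and threshold below are illustrative stand-ins for the trained HMMs used in the dissertation.

```python
# Minimal sketch of filler-model spotting via a log-likelihood ratio.
# The models, window size, and threshold are illustrative assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Toy 1-D per-frame feature stream: background around 0.0, with a gesture
# around 2.0 embedded at frames 60..80.
stream = rng.normal(0.0, 1.0, size=150)
stream[60:80] += 2.0

gesture_model = norm(loc=2.0, scale=1.0)  # stand-in for the gesture HMM
filler_model = norm(loc=0.0, scale=1.0)   # stand-in for the filler model

win, threshold = 20, 10.0
llr = gesture_model.logpdf(stream) - filler_model.logpdf(stream)

# Sliding-window sum of per-frame log-likelihood ratios; a window whose
# score exceeds the threshold "spots" (detects and localizes) the gesture.
scores = np.convolve(llr, np.ones(win), mode="valid")
hits = np.flatnonzero(scores > threshold)
if hits.size:
    print(f"gesture spotted around frames {hits[0]}..{hits[-1] + win}")
```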

Bibliographic Record

  • Author Affiliation: State University of New York at Buffalo
  • Degree Grantor: State University of New York at Buffalo
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2013
  • Pagination: 111 p.
  • Total Pages: 111
  • Format: PDF
  • Language: English

