
Language Motivated Approaches for Human Action Recognition and Spotting.



Abstract

Action recognition has become an important area of computer vision research. Given a sequence of images in which people perform different actions over time, can a system be designed to automatically recognize which action is being performed, and in which specific frames it occurs? To date, much of the computer vision community has approached this problem from a single-action perspective, reducing it to the classification of an image sequence containing one action. Given a sequence, it is thus assumed in advance that only one major action, drawn from a known set of classes, occurs in it. This dissertation targets not only the recognition of actions but also the problem of spotting (localizing) actions in video data.

Our proposed approach shares sub-actions across action classes to uncover the underlying motion patterns within actions, and uses these patterns for recognition and spotting. First, as a proof of concept, we build a framework that models an action as a predefined sequence of sub-actions, and we show experimentally that this framework is useful for both action recognition and spotting. Next, building on this approach, we learn the sub-actions automatically rather than defining them manually. To gain statistical insight into the underlying motion patterns of actions, we have developed a dynamic, hierarchical Bayesian model that connects low-level visual features in videos with poses, motion patterns, and classes of activities. The process is somewhat analogous to detecting topics or categories in documents based on their word content, except that our documents are dynamic. The proposed generative model harnesses both the temporal ordering power of dynamic Bayesian networks such as hidden Markov models (HMMs) and the automatic clustering power of hierarchical Bayesian models such as latent Dirichlet allocation (LDA).

We have also introduced a probabilistic framework for detecting and localizing pre-specified actions (or gestures) in a video sequence, analogous to the use of filler models for keyword detection in speech processing. We demonstrate the robustness of our classification model and our spotting framework by recognizing actions in unconstrained real-life video sequences and by spotting gestures via a one-shot-learning approach. Thanks to advances in human action recognition, several publicly available datasets now offer large numbers of actions collected from various media sources, reflecting real-world scenarios. We have evaluated the proposed methods on these datasets and outperformed several techniques described in the literature.

Overall, we have proposed a new, robust framework for modeling actions that offers insight into their building blocks rather than merely performing recognition.
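To make the document-topic analogy concrete, the following is a minimal sketch of the static version of that idea: per-frame motion descriptors are quantized into a codebook of "visual words", each clip becomes a bag of words, and LDA clusters clips into latent motion-pattern topics. The codebook size, topic count, and synthetic descriptors are illustrative assumptions only; the dissertation's actual model additionally imposes HMM-style temporal ordering, which this sketch omits.

```python
# Illustrative sketch of the document-topic analogy for action recognition.
# All data and parameters here are hypothetical, not the dissertation's model.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)

# Toy per-frame motion descriptors for six clips (stand-ins for flow features).
clips = [rng.normal(loc=c % 3, scale=0.5, size=(40, 8)) for c in range(6)]

# 1. Build a codebook of "visual words" by clustering all frame descriptors.
codebook = KMeans(n_clusters=20, n_init=10, random_state=0)
codebook.fit(np.vstack(clips))

# 2. Represent each clip as a bag-of-words histogram over the codebook.
def clip_histogram(frames):
    words = codebook.predict(frames)
    return np.bincount(words, minlength=20)

X = np.array([clip_histogram(c) for c in clips])

# 3. LDA discovers latent "motion pattern" topics shared across clips; a
#    clip's topic mixture can then feed a downstream action classifier.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
theta = lda.fit_transform(X)  # clip-by-topic mixture weights
print(np.round(theta, 2))
```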
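Similarly, the filler-model spotting idea can be sketched as a likelihood-ratio test: each sliding window of the feature stream is scored under a pre-specified gesture model against a generic filler (background) model, and windows whose log-likelihood ratio clears a threshold are reported as detections. The Gaussian emission models, window length, and threshold below are illustrative stand-ins for the trained HMMs used in the dissertation.

```python
# Minimal sketch of filler-model spotting via a log-likelihood ratio.
# The models, window size, and threshold are illustrative assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Toy 1-D per-frame feature stream: background around 0.0, with a gesture
# around 2.0 embedded at frames 60..80.
stream = rng.normal(0.0, 1.0, size=150)
stream[60:80] += 2.0

gesture_model = norm(loc=2.0, scale=1.0)  # stand-in for the gesture HMM
filler_model = norm(loc=0.0, scale=1.0)   # stand-in for the filler model

win, threshold = 20, 10.0
llr = gesture_model.logpdf(stream) - filler_model.logpdf(stream)

# Sliding-window sum of per-frame log-likelihood ratios; a window whose
# score exceeds the threshold "spots" (detects and localizes) the gesture.
scores = np.convolve(llr, np.ones(win), mode="valid")
hits = np.flatnonzero(scores > threshold)
if hits.size:
    print(f"gesture spotted around frames {hits[0]}..{hits[-1] + win}")
```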

Bibliographic Record

  • Author Affiliation: State University of New York at Buffalo
  • Degree Grantor: State University of New York at Buffalo
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2013
  • Pagination: 111 p.
  • Total Pages: 111
  • Format: PDF
  • Language: English

