Few-Shot Learning of Video Action Recognition Only Based on Video Contents

Abstract

The success of video action recognition based on Deep Neural Networks (DNNs) is highly dependent on a large number of manually labeled videos. In this paper, we introduce a supervised learning approach to recognize video actions with very few training videos. Specifically, we propose Temporal Attention Vectors (TAVs), which adapt to videos of various lengths and preserve the temporal information of the entire video. We evaluate the TAVs on UCF101 and HMDB51. Without training any deep 3D or 2D frame feature extractors on video datasets (they are only pre-trained on ImageNet), the TAVs introduce only 2.1M parameters but outperform the state-of-the-art video action recognition benchmarks with very few labeled training videos (e.g., 92% on UCF101 and 59% on HMDB51, with 10 and 8 training videos per class, respectively). Furthermore, our approach still achieves competitive results on the full datasets (97.1% on UCF101 and 77% on HMDB51).
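
The abstract does not spell out how TAVs are computed, so the following is not the paper's formulation. It is only a minimal sketch of the general idea of mapping videos of different lengths to a fixed-size representation with attention over per-frame features. The function names, the query vectors, and the toy feature values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, queries):
    """Pool T frame vectors into len(queries) vectors.

    Each query scores every frame by dot product; the softmax of those
    scores weights the frames, so the output size does not depend on T.
    This is generic attention pooling, assumed here for illustration.
    """
    dim = len(frame_features[0])
    pooled = []
    for query in queries:
        scores = [sum(q * f for q, f in zip(query, frame))
                  for frame in frame_features]
        weights = softmax(scores)
        pooled.append([
            sum(w * frame[d] for w, frame in zip(weights, frame_features))
            for d in range(dim)
        ])
    return pooled


# Hypothetical 2-D frame features (in practice these would come from an
# ImageNet-pre-trained extractor, as the abstract describes).
short_video = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]              # 3 frames
long_video = [[0.1 * t, 1.0 - 0.1 * t] for t in range(7)]       # 7 frames
queries = [[1.0, 0.0], [0.0, 1.0]]                              # 2 queries

short_repr = attention_pool(short_video, queries)
long_repr = attention_pool(long_video, queries)
```

Both `short_repr` and `long_repr` contain two 2-dimensional vectors even though the input videos have 3 and 7 frames, which is the property a downstream classifier needs when only a handful of labeled videos per class is available.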

Bibliographic record

Source: IEEE Winter Conference on Applications of Computer Vision | 2020 | pp. 584-593 | 10 pages
Authors: Yang Bo; Yangdi Lu; Wenbo He
Original format: PDF
Keywords: Training; Correlation; Supervised learning; Feature extraction; Visualization; Recurrent neural networks
