
LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition


Abstract

This paper presents LiteEval, a simple yet effective coarse-to-fine framework for resource efficient video recognition, suitable for both online and offline scenarios. Exploiting decent yet computationally efficient features derived at a coarse scale with a lightweight CNN model, LiteEval dynamically decides on-the-fly whether to compute more powerful features for incoming video frames at a finer scale to obtain more details. This is achieved by a coarse LSTM and a fine LSTM operating cooperatively, as well as a conditional gating module to learn when to allocate more computation. Extensive experiments are conducted on two large-scale video benchmarks, FCVID and ActivityNet, and the results demonstrate LiteEval requires substantially less computation while offering excellent classification accuracy for both online and offline predictions.
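The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the moving-average `blend` stands in for the coarse and fine LSTMs, the hand-written `novelty_gate` stands in for the learned conditional gating module, and the names `coarse_to_fine`, `Result`, `coarse_cost`, and `fine_cost` are all hypothetical. The cost units are arbitrary and chosen only to show where computation is saved.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def blend(state: Optional[Vector], feature: Vector, rate: float) -> Vector:
    """Moving-average update; an assumed stand-in for one LSTM step."""
    if state is None:
        return list(feature)
    return [(1.0 - rate) * s + rate * f for s, f in zip(state, feature)]


@dataclass
class Result:
    fine_state: Optional[Vector] = None                    # state built from fine features
    fine_frames: List[int] = field(default_factory=list)   # frames that got fine features
    cost: float = 0.0                                       # accumulated (hypothetical) cost


def coarse_to_fine(
    frames: Sequence[Vector],
    coarse_fn: Callable[[Vector], Vector],
    fine_fn: Callable[[Vector], Vector],
    gate_fn: Callable[[Optional[Vector], Vector], bool],
    coarse_cost: float = 1.0,
    fine_cost: float = 10.0,
    rate: float = 0.5,
) -> Result:
    """Process frames in order, computing fine features only when the gate fires."""
    result = Result()
    coarse_state: Optional[Vector] = None
    for t, frame in enumerate(frames):
        coarse = coarse_fn(frame)            # cheap features, computed for every frame
        result.cost += coarse_cost
        if gate_fn(coarse_state, coarse):    # decide on the fly from coarse information
            result.fine_state = blend(result.fine_state, fine_fn(frame), rate)
            result.cost += fine_cost
            result.fine_frames.append(t)
        coarse_state = blend(coarse_state, coarse, rate)
    return result


def novelty_gate(prev: Optional[Vector], coarse: Vector, threshold: float = 2.0) -> bool:
    """Hypothetical rule: fire on the first frame or when coarse features change a lot."""
    if prev is None:
        return True
    return max(abs(c - p) for c, p in zip(coarse, prev)) > threshold


# Toy "video": each frame is a short list of numbers.
frames = [[0.0, 0.0], [0.0, 0.0], [4.0, 6.0], [4.0, 6.0], [4.0, 6.0]]
result = coarse_to_fine(
    frames,
    coarse_fn=lambda f: [sum(f) / len(f)],  # coarse view: the frame mean
    fine_fn=lambda f: list(f),              # fine view: the full frame
    gate_fn=novelty_gate,
)
print(result.fine_frames, result.cost)
```

In this toy run, fine features are computed for three of the five frames, so the accumulated cost is 35 units instead of the 55 units that computing both feature types on every frame would require under the same assumed costs. Because each decision uses only frames seen so far, the same loop works for streaming (online) input.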


Bibliographic details

Source: Conference on Neural Information Processing Systems | 2020 | pp. 7160-7959 | 10 pages
Authors: Zuxuan Wu; Caiming Xiong; Yu-Gang Jiang; Larry S. Davis
Original format: PDF
Chinese Library Classification: Metrology

Similar literature

1. Supervised framework for automatic recognition and retrieval of interaction: a framework for classification and retrieving videos with similar human interactions [J]. C. Chattopadhyay, S. Das. Computer Vision, IET. 2016, No. 3
2. A coarse-to-fine framework to efficiently thwart plagiarism [J]. Zhang H., Chow T.W.S. Pattern Recognition: The Journal of the Pattern Recognition Society. 2011, No. 2
3. V-LPDR: Towards a unified framework for license plate detection, tracking, and recognition in real-world traffic videos [J]. Zhang Cong, Wang Qi, Li Xuelong. Neurocomputing. 2021, No. Auga18
4. LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition [C]. Zuxuan Wu, Caiming Xiong, Yu-Gang Jiang, Conference on Neural Information Processing Systems. 2020
5. Robust and Efficient Activity Recognition from Videos [D]. Li, Xin. 2020
6. A framework for the recognition of high-level surgical tasks from video images for cataract surgeries [O]. Florent Lalys, Laurent Riffaud, David Bouget
7. An action recognition framework for uncontrolled video capture based on a spatio-temporal video graph [O]. Jargalsaikhan Iveel. 2017
