A two-stage temporal proposal network for precise action localization in untrimmed video

Wang Fei; Wang Guorui; Du Yuxuan; He Zhenquan; Jiang Yong


Abstract

In this paper, we propose a two-stage temporal proposal algorithm for action detection in long untrimmed videos. In the first stage, we propose a novel prior-minor watershed algorithm for action proposals, which combines a precise prior watershed proposal algorithm with a minor supplementary sliding window algorithm. Here, we propose a correctness discriminator that uses sliding window proposals to fill in the proposals that the watershed proposal algorithm may omit. In the second stage, we first propose extended context pooling (ECP) with two modules (internal and context). The context information module of ECP can structure the proposals and enhance the extended features of action proposals. Different levels of ECP are introduced to model the action proposal region and make its extended context region more targeted and precise. We then propose a temporal context regression network, which adopts a multi-task loss to train temporal coordinate regression and action/background classification simultaneously, and outputs precise temporal boundaries for the proposals. We also propose prior-minor ranking to balance the effect of the prior watershed proposals and the minor supplementary proposals. On three large-scale benchmarks, THUMOS14, ActivityNet (v1.2 and v1.3), and Charades, our approach achieves superior performance compared with other state-of-the-art methods and runs at over 1020 frames per second (fps) on a single NVIDIA Titan-X Pascal GPU, indicating that our method can efficiently improve the precision of action localization.
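
The first-stage idea of supplementing prior proposals with sliding windows can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the correctness discriminator decides, so the overlap rule below (temporal IoU against a hypothetical `max_overlap` threshold) is an assumed stand-in, and the function names and example segments are invented for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def sliding_windows(duration: float, length: float, stride: float) -> List[Segment]:
    """Fixed-length windows that fit inside [0, duration]."""
    windows = []
    start = 0.0
    while start + length <= duration:
        windows.append((start, start + length))
        start += stride
    return windows


def fill_with_minor(prior: List[Segment], minor: List[Segment],
                    max_overlap: float = 0.3) -> List[Segment]:
    """Keep every prior proposal; add a minor (sliding window) proposal only
    when no prior proposal overlaps it by more than max_overlap.
    Assumed rule, standing in for the paper's correctness discriminator."""
    kept = list(prior)
    for window in minor:
        if all(temporal_iou(window, p) <= max_overlap for p in prior):
            kept.append(window)
    return kept


# Hypothetical example: one prior proposal in a 12-second video.
prior = [(2.0, 6.0)]
minor = sliding_windows(duration=12.0, length=4.0, stride=4.0)
proposals = fill_with_minor(prior, minor)
```

In this toy case only the window that covers a region the prior proposal misses is added, which mirrors the "fill what may be omitted" role described in the abstract.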

Bibliographic record

Source
International Journal of Machine Learning and Cybernetics, 2021, issue 8, pp. 2199-2211 (13 pages)
Authors
Wang Fei; Wang Guorui; Du Yuxuan; He Zhenquan; Jiang Yong
Author affiliations

Northeastern Univ Fac Robot Sci & Engn Shenyang 110169 Peoples R China;

Northeastern Univ Coll Informat Sci & Engn Shenyang 110819 Peoples R China;

Northeastern Univ Coll Informat Sci & Engn Shenyang 110819 Peoples R China;

Northeastern Univ Coll Informat Sci & Engn Shenyang 110819 Peoples R China;

Chinese Acad Sci Shenyang Inst Automat Shenyang 110016 Peoples R China;

Original format: PDF
Language: English
Keywords
Action detection; Correctness discriminator; Extended context pooling; Temporal context regression

Similar literature

1. Temporal Action Localization in Untrimmed Videos Using Action Pattern Trees [J]. Song Hao, Wu Xinxiao, Zhu Bing, IEEE Transactions on Multimedia. 2019, issue 3
2. Segment-Tube: Spatio-Temporal Action Localization in Untrimmed Videos with Per-Frame Segmentation [J]. Le Wang, Xuhuan Duan, Qilin Zhang, Sensors. 2018, issue 5
3. Graph-based temporal action co-localization from an untrimmed video [J]. Wang Le, Zhai Changbo, Zhang Qilin, Neurocomputing. 2021, issue Apra28
4. CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos [C]. Zheng Shou, Jonathan Chan, Alireza Zareian, IEEE Conference on Computer Vision and Pattern Recognition. 2017
5. Generating Temporal Action Proposals in Long Untrimmed Videos [D]. Vaishnavi, Pratik. 2018
6. Spatio-Temporal Action Detection in Untrimmed Videos by Using Multimodal Features and Region Proposals [O]. Yeongtaek Song, Incheol Kim. 2019
7. CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos [O]. Shou, Zheng, Chan, Jonathan, Zareian, Alireza. 2017
