IEEE International Conference on Image Processing

Atrous Temporal Convolutional Network for Video Action Segmentation



Abstract

Fine-grained temporal human action segmentation in untrimmed videos is receiving increasing attention due to its extensive applications in surveillance, robotics, and beyond. It is crucial for an action segmentation system to be robust to the temporal scale of different actions, since in practical applications the duration of an action can vary from less than a second to tens of minutes. In this paper, we introduce a novel atrous temporal convolutional network (AT-Net), which explicitly generates multiscale video contextual representations via atrous temporal pyramid pooling (ATPP) and adopts an encoder-decoder fully convolutional architecture. In the decoding stage, AT-Net combines multiscale contextual features with low-level local features to generate high-quality action segmentation results. Experiments on the 50 Salads, GTEA, and JIGSAWS benchmarks demonstrate that AT-Net improves over the state of the art.
