
Dynamic Video Segmentation Network

Abstract

In this paper, we present a detailed design of the dynamic video segmentation network (DVSNet) for fast and efficient semantic video segmentation. DVSNet consists of two convolutional neural networks: a segmentation network and a flow network. The former generates highly accurate semantic segmentations, but is deeper and slower. The latter is much faster, but its output requires further processing to produce semantic segmentations, and those segmentations are less accurate. We explore the use of a decision network to adaptively assign different frame regions to different networks based on a metric called the expected confidence score. Frame regions with a higher expected confidence score traverse the flow network, while frame regions with a lower expected confidence score pass through the segmentation network. We have performed extensive experiments on various configurations of DVSNet and investigated a number of variants of the proposed decision network. The experimental results show that DVSNet achieves up to 70.4% mIoU at 19.8 fps on the Cityscapes dataset. A high-speed version of DVSNet delivers 30.4 fps with 63.2% mIoU on the same dataset. DVSNet is also able to reduce the computational workload by up to 95%.
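
The routing rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `route_regions` and `segmentation_fraction`, the fixed `threshold` parameter, and the example scores are all hypothetical. The abstract states only that regions with a higher expected confidence score go to the flow network and regions with a lower score go to the segmentation network.

```python
from typing import List, Sequence


def route_regions(expected_confidence: Sequence[float], threshold: float) -> List[str]:
    """Assign each frame region to a network by its expected confidence score.

    Assumption: a single fixed threshold separates "higher" from "lower"
    scores. The abstract does not say how the cut-off is chosen.
    """
    return [
        "flow" if score > threshold else "segmentation"
        for score in expected_confidence
    ]


def segmentation_fraction(routes: Sequence[str]) -> float:
    """Fraction of regions that take the slower segmentation path."""
    if not routes:
        return 0.0
    return sum(1 for r in routes if r == "segmentation") / len(routes)


# Hypothetical scores for four regions of one frame.
scores = [0.95, 0.40, 0.88, 0.72]
routes = route_regions(scores, threshold=0.8)
print(routes)
print(segmentation_fraction(routes))
```

Under this assumed rule, raising the threshold sends more regions through the segmentation network, which trades speed for accuracy. Lowering it favours the faster flow path.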