Journal: Neurocomputing

GAN-Based virtual-to-real image translation for urban scene semantic segmentation


Abstract

Semantic image segmentation requires large amounts of pixel-wise labeled training data. Creating such data generally requires labor-intensive manual annotation. Extracting training data from video games is therefore a practical idea, since pixel-wise annotation can be automated from video games with near-perfect accuracy. However, experiments show that models trained on raw video-game data cannot be directly applied to real-world scenes because of the domain shift problem. In this paper, we propose a domain-adaptive network based on CycleGAN that translates scenes from a virtual domain to a real domain in both the pixel and feature spaces. Our contributions are threefold: 1) we propose a dynamic perceptual network to improve the quality of the generated images in the feature spaces, making the translated images more conducive to semantic segmentation; 2) we introduce a novel weighted self-regularization loss to prevent semantic changes in translated images; and 3) we design a discrimination mechanism to coordinate multiple subnetworks and improve overall training efficiency. We devise a series of metrics to evaluate the quality of translated images in our experiments on the public GTA-V (a video game dataset, i.e., the virtual domain) and Cityscapes (a real-world dataset, i.e., the real domain) datasets, and we achieve notably improved results, demonstrating the efficacy of the proposed model. (C) 2019 Elsevier B.V. All rights reserved.
