IEEE Transactions on Geoscience and Remote Sensing

Dense Semantic Labeling of Subdecimeter Resolution Images With Convolutional Neural Networks


Abstract

Semantic labeling (or pixel-level land-cover classification) in ultrahigh-resolution imagery (<10 cm) requires statistical models able to learn high-level concepts from spatial data, with large appearance variations. Convolutional neural networks (CNNs) achieve this goal by learning discriminatively a hierarchy of representations of increasing abstraction. In this paper, we present a CNN-based system relying on a downsample-then-upsample architecture. Specifically, it first learns a rough spatial map of high-level representations by means of convolutions and then learns to upsample them back to the original resolution by deconvolutions. By doing so, the CNN learns to densely label every pixel at the original resolution of the image. This results in many advantages, including: 1) state-of-the-art numerical accuracy; 2) improved geometric accuracy of predictions; and 3) high efficiency at inference time. We test the proposed system on the Vaihingen and Potsdam subdecimeter resolution data sets, involving the semantic labeling of aerial images of 9- and 5-cm resolution, respectively. These data sets are composed of many large and fully annotated tiles, allowing an unbiased evaluation of models making use of spatial information. We do so by comparing two standard CNN architectures with the proposed one: standard patch classification, prediction of local label patches by employing only convolutions, and full patch labeling by employing deconvolutions. All the systems compare favorably with or outperform a state-of-the-art baseline relying on superpixels and powerful appearance descriptors. The proposed full patch labeling CNN outperforms these models by a large margin, also showing a very appealing inference time.
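The sketch below illustrates the downsample-then-upsample idea described in the abstract: strided convolutions produce a coarse map of high-level features, and transposed convolutions ("deconvolutions") upsample it back to the input resolution so every pixel receives a class score. It is a minimal PyTorch illustration only; the layer counts, channel widths, and the six-class output are placeholder assumptions, not the exact architecture reported in the paper.

```python
# Minimal downsample-then-upsample network for dense labeling (illustrative sketch,
# not the architecture from the paper; sizes and class count are placeholders).
import torch
import torch.nn as nn

class DenseLabelingCNN(nn.Module):
    def __init__(self, in_channels=3, num_classes=6):
        super().__init__()
        # Encoder: strided convolutions learn a coarse spatial map of high-level features.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Decoder: transposed convolutions learn to upsample the coarse map back
        # to the original resolution, producing one score per class per pixel.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, num_classes, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Usage: an image patch in, a full-resolution map of class scores out.
model = DenseLabelingCNN()
patch = torch.randn(1, 3, 256, 256)   # e.g. a 256x256 RGB aerial patch
scores = model(patch)                 # shape: (1, 6, 256, 256)
labels = scores.argmax(dim=1)         # per-pixel class predictions
```

Because the decoder restores the original resolution in a single forward pass, the whole patch is labeled at once, which is what gives this family of models its favorable inference time compared with per-pixel patch classification.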

