Spatial Structure Preserving Feature Pyramid Network for Semantic Image Segmentation

YUAN YUAN; JIE FANG; XIAOQIANG LU; YACHUANG FENG


Abstract

Semantic image segmentation has progressed substantially in recent years, benefiting from the rapid development of Convolutional Neural Networks. Most recently proposed approaches are based on Fully Convolutional Networks (FCNs). However, these FCN-based methods use large receptive fields and too many pooling layers to capture the discriminative semantic information of images. Specifically, on the one hand, convolutional kernels with large receptive fields smooth detailed edges, since too much contextual information is used to describe the "center pixel." On the other hand, pooling layers enlarge the receptive field by shrinking the latest feature maps, which loses much of the image's detail, especially in the deeper layers of the network. These operations often cause low spatial resolution in deep layers, which leads to spatially fragmented predictions. To address this problem, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to extract feature maps at different resolutions and take full advantage of them through gradual stacked fusion. Specifically, for two adjacent convolutional layers, we upsample the features from the deeper layer with a stride of 2 and then stack them on the features from the shallower layer. A convolutional layer with 1 × 1 kernels then fuses the stacked features. The fused feature preserves the spatial structure information of the image while retaining strong discriminative capability for pixel classification. Additionally, to further preserve the spatial structure information and regional connectivity of the predicted category label map, we propose a novel loss term for the network. In detail, we propose two graph model-based spatial affinity matrices, which describe the pixel-level relationships in the input image and in the predicted category label map respectively; their cosine distance is then back-propagated through the network. The proposed architecture, called the spatial structure preserving feature pyramid network, significantly improves the spatial resolution of the predicted category label map for semantic image segmentation. The proposed method achieves state-of-the-art results on three challenging public datasets for semantic image segmentation.
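The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the upsampling method, the graph model behind the affinity matrices, or any hyperparameters. The sketch assumes nearest-neighbour upsampling, a Gaussian affinity over per-pixel vectors, and the hypothetical names `upsample2x`, `fuse`, `affinity`, `structure_loss`, and `sigma`.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour upsampling by 2 of a (C, H, W) array (assumed method)."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fuse(shallow, deep, weight, bias):
    """Upsample `deep`, stack it on `shallow`, then apply a 1 x 1 convolution.

    shallow: (C_s, 2H, 2W), deep: (C_d, H, W),
    weight: (C_out, C_s + C_d), bias: (C_out,)
    """
    stacked = np.concatenate([shallow, upsample2x(deep)], axis=0)
    # A 1 x 1 convolution is a per-pixel linear map over the channel axis.
    return np.einsum("oc,chw->ohw", weight, stacked) + bias[:, None, None]

def affinity(vectors, sigma=1.0):
    """Gaussian affinity between all pairs of rows of an (N, D) array.

    Assumption: the paper's graph model may define affinities differently.
    The full N x N matrix is only practical for small inputs.
    """
    sq = np.sum(vectors ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * vectors @ vectors.T, 0.0)
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def structure_loss(image, probs, sigma=1.0):
    """Cosine distance between the image affinity and the prediction affinity.

    image: (C, H, W) input, probs: (K, H, W) per-class scores.
    """
    a_img = affinity(image.reshape(image.shape[0], -1).T, sigma).ravel()
    a_pred = affinity(probs.reshape(probs.shape[0], -1).T, sigma).ravel()
    cosine = a_img @ a_pred / (np.linalg.norm(a_img) * np.linalg.norm(a_pred))
    return 1.0 - cosine

# Usage on random data (shapes chosen only for illustration).
rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 8, 8))   # shallower layer, higher resolution
deep = rng.standard_normal((6, 4, 4))      # deeper layer, half the resolution
weight = rng.standard_normal((3, 10))      # 1 x 1 kernels: 10 channels in, 3 out
fused = fuse(shallow, deep, weight, np.zeros(3))
print(fused.shape)  # (3, 8, 8)
loss = structure_loss(shallow, fused)
```

Because both affinity matrices are strictly positive, the loss in this sketch lies between 0 and 1 and is 0 when the two affinity structures coincide. In a real training setup the same computation would be written in an automatic differentiation framework so that the loss can be back-propagated.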


Bibliographic record

Source
ACM Transactions on Multimedia Computing, Communications, and Applications | 2019, Issue 3 | 73.1-73.19 | 19 pages
Authors
YUAN YUAN; JIE FANG; XIAOQIANG LU; YACHUANG FENG
Author affiliations

Center for OPTical Imagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, China;

Key Laboratory of Spectral Imaging Technology CAS, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, China, and University of Chinese Academy of Sciences, China;

Key Laboratory of Spectral Imaging Technology CAS, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, China, and University of Chinese Academy of Sciences, China;

Key Laboratory of Spectral Imaging Technology CAS, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, China, and University of Chinese Academy of Sciences, China;

Original format: PDF
Language: English
Keywords
Semantic image segmentation; spatial resolution; feature pyramid network; discriminative capability

