IEEE Transactions on Pattern Analysis and Machine Intelligence

Human Parsing with Contextualized Convolutional Neural Network

Abstract

In this work, we address the human parsing task with a novel Contextualized Convolutional Neural Network (Co-CNN) architecture, which integrates the cross-layer context, global image-level context, semantic edge context, within-super-pixel context and cross-super-pixel neighborhood context into a unified network. Given an input human image, Co-CNN produces the pixel-wise categorization in an end-to-end way. First, the cross-layer context is captured by our basic local-to-global-to-local structure, which hierarchically combines the global semantic information and the local fine details across different convolutional layers. Second, the global image-level label prediction is used as an auxiliary objective in the intermediate layer of the Co-CNN, and its outputs are further used to guide the feature learning in subsequent convolutional layers, leveraging the global image-level context. Third, semantic edge context is further incorporated into Co-CNN, where the high-level semantic boundaries are leveraged to guide pixel-wise labeling. Finally, to further utilize the local super-pixel contexts, within-super-pixel smoothing and cross-super-pixel neighborhood voting are formulated as natural sub-components of the Co-CNN to achieve local label consistency in both the training and testing processes. Comprehensive evaluations on two public datasets demonstrate the significant superiority of our Co-CNN over other state-of-the-art methods for human parsing. In particular, the F-1 score on the large dataset [1] reaches 81.72% with Co-CNN, significantly higher than the 62.81% and 64.38% achieved by the state-of-the-art algorithms M-CNN [2] and ATR [1], respectively. By utilizing our newly collected large dataset for training, our Co-CNN can achieve an F-1 score of 85.36%.
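
The abstract outlines several concrete architectural components. The sketch below is a minimal, hypothetical PyTorch rendering of the two ideas that translate most directly into code: the local-to-global-to-local backbone with cross-layer feature fusion, and the auxiliary image-level label prediction whose output is broadcast back to guide later convolutional layers, plus a simple stand-in for within-super-pixel smoothing. It is not the authors' implementation; the layer widths, NUM_LABELS, NUM_TAGS and all helper names are assumptions made only for illustration.

# Minimal illustrative sketch (not the authors' code) of the Co-CNN ideas
# described above, written against PyTorch. Layer sizes and label counts
# are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_LABELS = 18   # assumed number of pixel-wise parsing categories
NUM_TAGS = 18     # assumed number of image-level labels (auxiliary task)

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True))

class CoCNNSketch(nn.Module):
    """Local-to-global-to-local backbone; the global image-level prediction
    is tiled and concatenated back into the decoding layers."""

    def __init__(self):
        super().__init__()
        # local-to-global: downsample to gather increasingly global context
        self.enc1 = conv_block(3, 64)
        self.enc2 = conv_block(64, 128)
        self.enc3 = conv_block(128, 256)
        self.pool = nn.MaxPool2d(2)

        # auxiliary global image-level label prediction
        self.global_fc = nn.Linear(256, NUM_TAGS)

        # global-to-local: upsample and fuse fine details via cross-layer
        # concatenation; predicted image-level labels enter as extra channels
        self.dec2 = conv_block(256 + 128 + NUM_TAGS, 128)
        self.dec1 = conv_block(128 + 64 + NUM_TAGS, 64)

        # pixel-wise parsing head and a semantic-edge head (boundary guidance)
        self.parse_head = nn.Conv2d(64, NUM_LABELS, 1)
        self.edge_head = nn.Conv2d(64, 1, 1)

    def forward(self, x):
        f1 = self.enc1(x)              # full resolution, fine local details
        f2 = self.enc2(self.pool(f1))  # 1/2 resolution
        f3 = self.enc3(self.pool(f2))  # 1/4 resolution, most global view

        # image-level prediction from globally pooled features
        tag_logits = self.global_fc(F.adaptive_avg_pool2d(f3, 1).flatten(1))
        tag_prob = torch.sigmoid(tag_logits)

        def tile(v, ref):              # broadcast a label vector to a feature map
            return v[:, :, None, None].expand(-1, -1, ref.size(2), ref.size(3))

        u2 = F.interpolate(f3, size=f2.shape[2:], mode='bilinear', align_corners=False)
        d2 = self.dec2(torch.cat([u2, f2, tile(tag_prob, f2)], dim=1))

        u1 = F.interpolate(d2, size=f1.shape[2:], mode='bilinear', align_corners=False)
        d1 = self.dec1(torch.cat([u1, f1, tile(tag_prob, f1)], dim=1))

        return self.parse_head(d1), self.edge_head(d1), tag_logits

def within_superpixel_smoothing(prob, sp):
    """Average per-pixel label probabilities inside each super-pixel.
    prob: (C, H, W) softmax scores; sp: (H, W) integer super-pixel ids."""
    smoothed = prob.clone()
    for s in sp.unique():
        mask = sp == s
        smoothed[:, mask] = prob[:, mask].mean(dim=1, keepdim=True)
    return smoothed

In the actual Co-CNN, the pixel-wise parsing loss, the image-level prediction loss and the semantic-edge loss would be trained jointly end to end; cross-super-pixel neighborhood voting is omitted from this sketch for brevity.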