Using Convolutional Encoder-Decoder for Document Image Binarization

Abstract

Document image binarization is one of the critical initial steps for document analysis and understanding. Previous work mostly focused on exploiting hand-crafted features to build statistical models for distinguishing text from background. However, these approaches achieved only limited success because: (a) the effectiveness of hand-crafted features is limited by the researcher's domain knowledge and understanding of the documents, and (b) a universal model cannot always capture the complexity of different document degradations. To address these challenges, in this paper we propose a deep-learning-based convolutional encoder-decoder model for document image binarization. In the proposed method, mid-level document image representations are learnt by a stack of convolutional layers, which form the encoder of the architecture. The binarized image is then obtained by mapping the low-resolution representations back to the original size through the decoder, which is composed of a series of transposed convolutional layers. We compare the proposed binarization method with other binarization algorithms both qualitatively and quantitatively on a public dataset. The experimental results show that the proposed method performs comparably to binarization approaches based on hand-crafted features and generalizes better when in-domain training data is limited.
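
The encoder-decoder idea described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' network: the single-channel layers, the 2x2 kernels, the stride of 2, the ReLU and sigmoid activations, the 0.5 threshold, and all function names are assumptions chosen for illustration, and the kernel values are hand-picked placeholders rather than learned weights. The sketch only shows how strided convolutions reduce resolution and how transposed convolutions map the result back to the input size.

```python
import math

def conv2d(img, kernel, stride=2):
    """Strided 2D cross-correlation without padding (one channel)."""
    k = len(kernel)
    out_h = (len(img) - k) // stride + 1
    out_w = (len(img[0]) - k) // stride + 1
    return [[sum(img[i * stride + a][j * stride + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out_w)]
            for i in range(out_h)]

def conv_transpose2d(feat, kernel, stride=2):
    """Transposed convolution: every input value scatters a scaled kernel copy."""
    k = len(kernel)
    out_h = (len(feat) - 1) * stride + k
    out_w = (len(feat[0]) - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    for i, row in enumerate(feat):
        for j, v in enumerate(row):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += v * kernel[a][b]
    return out

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def sigmoid(x):
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in x]

def binarize(img, enc_kernels, dec_kernels, threshold=0.5):
    """Encoder (strided convs) -> decoder (transposed convs) -> per-pixel label."""
    h = img
    for kern in enc_kernels:                  # encoder: resolution shrinks
        h = relu(conv2d(h, kern))
    for idx, kern in enumerate(dec_kernels):  # decoder: resolution grows back
        h = conv_transpose2d(h, kern)
        if idx < len(dec_kernels) - 1:
            h = relu(h)
    probs = sigmoid(h)                        # per-pixel text probability
    return [[1 if p > threshold else 0 for p in row] for row in probs]

# Placeholder kernels (assumed, not trained): averaging in the encoder,
# all-ones in the decoder.
avg = [[0.25, 0.25], [0.25, 0.25]]
ones = [[1.0, 1.0], [1.0, 1.0]]

# Toy 8x8 "document": ink intensity 1.0 in the top-left 4x4 block.
image = [[1.0 if r < 4 and c < 4 else 0.0 for c in range(8)] for r in range(8)]
mask = binarize(image, [avg, avg], [ones, ones])
```

With two stride-2 encoder layers the 8x8 input is reduced to 2x2, and the two transposed layers bring it back to 8x8, so `mask` has the same shape as `image`. In a trained model the kernels would be multi-channel and learned from labeled document images instead of fixed by hand.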


Bibliographic details

Source: IAPR International Conference on Document Analysis and Recognition | 2017 | 732p | 6 pages in total
Authors: Xujun Peng; Huaigu Cao; Prem Natarajan
Original format: PDF
Chinese Library Classification: TP391.41-53
Keywords: Neural networks; Feature extraction; Task analysis; Decoding; Image segmentation; Computer architecture; Semantics


Similar literature


1. Binarized Encoder-Decoder Network and Binarized Deconvolution Engine for Semantic Segmentation [J]. Hyunwoo Kim, Jeonghoon Kim, Jungwook Choi, Quality Control, Transactions. 2021, No. 1
2. Automatic segmentation of intracerebral hemorrhage in CT images using encoder-decoder convolutional neural network [J]. Kai Hu, Kai Chen, Xizhi He, Information Processing & Management. 2020, No. 6
3. Height estimation from single aerial images using a deep convolutional encoder-decoder network [J]. Amirkolaee Hamed Amini, Arefi Hossein, ISPRS Journal of Photogrammetry and Remote Sensing. 2019, No. MARa
4. Using Convolutional Encoder-Decoder for Document Image Binarization [C]. Xujun Peng, Huaigu Cao, Prem Natarajan. 2017
5. Effective and efficient binarization of degraded document images [D]. Parker, Jon Ivan. 2016
6. Robust Combined Binarization Method of Non-Uniformly Illuminated Document Images for Alphanumerical Character Recognition [O]. Hubert Michalak, Krzysztof Okarma. 2020
7. Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images [O]. Younes Akbari, Somaya Al-Maadeed, Kalthoum Adam. 2020
