Separation of Foreground Text from Complex Background in Color Document Images

Abstract

Foreground text is difficult to read in documents with multicolored, complex backgrounds. Automatic separation of the foreground text in such document images is essential for smooth reading of the document contents. In this paper we propose a hybrid approach that combines connected component analysis with unsupervised thresholding to separate text from the complex background. The proposed approach identifies candidate text regions by edge detection followed by connected component analysis. Because of the background complexity, a non-text region may also be identified as a text region. To overcome this problem, we extract texture features of the connected components and analyze the feature values. Finally, the threshold value for each detected text region is derived automatically from the data of the corresponding image region to perform foreground separation. The proposed approach can handle document images with varying backgrounds of multiple colors, and foreground text of any color, font, and size. Experimental results show that the proposed algorithm detects on average 97.8% of the text regions in the source document. The readability of the extracted foreground text is illustrated through OCR.
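
The pipeline described in the abstract (edge detection, connected component analysis, a feature check on each component, then a per-region threshold) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not name the edge detector, the texture features, or the thresholding rule, so the sketch substitutes a forward-difference gradient, the standard deviation of the region as a crude stand-in feature, and Otsu's method for the unsupervised threshold. All function names and parameters (edge_thresh, min_std) are hypothetical, and the input is assumed to be a grayscale image given as a list of rows of integers in 0..255.

```python
from collections import deque
from statistics import pstdev


def edge_mask(gray, edge_thresh):
    """Mark pixels whose difference to the right or lower neighbor is large."""
    h, w = len(gray), len(gray[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = abs(gray[y][min(x + 1, w - 1)] - gray[y][x])
            gy = abs(gray[min(y + 1, h - 1)][x] - gray[y][x])
            mask[y][x] = max(gx, gy) >= edge_thresh
    return mask


def component_boxes(mask):
    """Bounding boxes (y0, x0, y1, x1) of 8-connected groups of edge pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            y0 = y1 = y
            x0 = x1 = x
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                y0, y1 = min(y0, cy), max(y1, cy)
                x0, x1 = min(x0, cx), max(x1, cx)
                for ny in range(max(cy - 1, 0), min(cy + 2, h)):
                    for nx in range(max(cx - 1, 0), min(cx + 2, w)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((y0, x0, y1, x1))
    return boxes


def otsu_threshold(values):
    """Otsu's method: threshold t maximizing between-class variance (class 0 is <= t)."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def separate_text(gray, edge_thresh=40, min_std=25.0):
    """Return a binary image with 1 where a pixel is judged to be foreground text."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y0, x0, y1, x1 in component_boxes(edge_mask(gray, edge_thresh)):
        # Pad the box by one pixel so its border samples the local background.
        y0, x0 = max(y0 - 1, 0), max(x0 - 1, 0)
        y1, x1 = min(y1 + 1, h - 1), min(x1 + 1, w - 1)
        coords = [(y, x) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
        region = [gray[y][x] for y, x in coords]
        # Assumption: standard deviation stands in for the paper's texture features.
        if pstdev(region) < min_std:
            continue
        t = otsu_threshold(region)  # threshold derived from this region only
        border = [gray[y][x] for y, x in coords if y in (y0, y1) or x in (x0, x1)]
        # Assumption: the class that dominates the box border is background.
        text_is_dark = 2 * sum(v <= t for v in border) < len(border)
        for y, x in coords:
            if (gray[y][x] <= t) == text_is_dark:
                out[y][x] = 1
    return out
```

Because the threshold is computed separately inside each candidate box, a page whose background changes from one area to another does not need a single global cutoff. Deciding text polarity from the border of the padded box is a choice made for this sketch; it lets the same code handle dark-on-light and light-on-dark text. A color document would first have to be converted to grayscale or processed per channel, which the sketch leaves out.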

Bibliographic record

Source: International Conference on Advances in Pattern Recognition, 2008, 4 pages
Authors: Shivananda Nirmala; Nagabhushan P.
Original format: PDF
Chinese Library Classification: TP391.4-53
Keywords: Color document image; Complex background; Connected component analysis; Feature extraction; OCR; Text separation; Thresholding
