Signal, Image and Video Processing

Statistical modeling for the detection, localization and extraction of text from heterogeneous textual images using combined feature scheme


Abstract

Discriminating between the text and non-text regions of an image is a complex and challenging task. In contrast to caption text, scene text can have any orientation and may be distorted by perspective projection. Moreover, it is often affected by variations in scene and camera parameters such as illumination and focus. These variations make it extremely difficult to design a unified text-extraction scheme for diverse kinds of images. This paper proposes a unified statistical approach for extracting text from hybrid textual images (containing both scene text and caption text) and from document images with textual variations, using features carefully selected by a multi-level feature priority (MLFP) algorithm. The selected features, taken together, prove to be a good choice of feature vectors, with the capacity to discriminate between text and non-text regions in scene-text, caption-text and document images, and the proposed system is robust to illumination, transformation/perspective projection, font size, and radially changing/angular text. The MLFP feature selection algorithm is evaluated with three common machine-learning algorithms, namely a decision tree inducer (C4.5), a naive Bayes classifier, and an instance-based K-nearest-neighbour learner, and its effectiveness is demonstrated by comparison with three other feature selection methods on a benchmark dataset. The proposed text extraction system is compared with edge-based, connected-component and texture-based methods and shows encouraging results. Its major applications include preprocessing for optical character recognition, multimedia processing, mobile robot navigation, vehicle license plate detection and recognition, page segmentation, and text-based image indexing.
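The MLFP algorithm itself is not given in the abstract, so the evaluation protocol it describes can only be sketched under assumptions: below, mutual-information ranking stands in as a hypothetical proxy for the multi-level feature priority ranking, synthetic data stands in for the benchmark region features, and scikit-learn's `DecisionTreeClassifier` stands in for C4.5 (scikit-learn implements CART, a related but distinct inducer). The three classifiers named in the abstract are then scored on the selected feature subset.

```python
# Hedged sketch of the evaluation protocol described in the abstract.
# Assumptions (not from the paper): mutual-information ranking as a proxy
# for MLFP; synthetic features in place of real text/non-text region
# features; CART in place of C4.5.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier      # C4.5-style tree inducer
from sklearn.naive_bayes import GaussianNB           # naive Bayes classifier
from sklearn.neighbors import KNeighborsClassifier   # instance-based k-NN

# Synthetic stand-in for region-level features (e.g. edge density, stroke
# width variance, texture energy); labels: 1 = text region, 0 = non-text.
X, y = make_classification(n_samples=400, n_features=20, n_informative=6,
                           random_state=0)

# Keep only the k highest-priority features under the proxy ranking.
selector = SelectKBest(mutual_info_classif, k=6).fit(X, y)
X_sel = selector.transform(X)

# Evaluate the selected subset with the three classifiers from the abstract.
for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("naive Bayes", GaussianNB()),
                  ("k-NN", KNeighborsClassifier(n_neighbors=5))]:
    score = cross_val_score(clf, X_sel, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

In the same spirit as the paper's comparison, the selected subset could be swapped for subsets chosen by other feature-selection methods and the cross-validated scores compared across classifiers.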
