A Robust Segmentation Technique for Line, Word and Character Extraction from Kannada Text in Low Resolution Display Board Images

Abstract

Reliable segmentation of text lines, words and characters is an important step in developing automated systems that understand the text in low-resolution display board images. This paper presents a new approach for segmenting text lines, words and characters from Kannada text in low-resolution display board images. The proposed method uses projection profile features and on-pixel distribution statistics to segment text lines. It also detects text lines that contain consonant modifiers, merges them with the corresponding text lines, and separates overlapped text lines. The character extraction process computes character boundaries from vertical profile features and extracts character images from every text line. The word segmentation process then uses k-means clustering to group inter-character gaps into character-space and word-space clusters, which are used to compute thresholds for extracting words. This lets the method accommodate variations in character and word gaps. The methodology is evaluated on a data set of 1008 low-resolution images of display boards containing Kannada text, captured with 2-megapixel mobile phone cameras at sizes of 240x320, 600x800 and 900x1200. The method achieves a text line segmentation accuracy of 97.17%, a word segmentation accuracy of 97.54% and a character extraction accuracy of 99.09%. It is tolerant to font variability, spacing variations between characters and words, the absence of a free segmentation path caused by consonant and vowel modifiers, noise and other degradations. Experiments on images containing overlapped text lines gave promising results.
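
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a clean binary image (1 for an "on" pixel, 0 for background), splits lines and characters wherever a projection profile drops to zero, and takes the word threshold as the midpoint between two k-means centroids. That midpoint rule and all function names (`segment_lines`, `segment_chars`, `two_means_threshold`, `segment_words`) are assumptions made for illustration. The handling of consonant modifiers, overlapped lines and noise that the paper reports is not attempted here.

```python
from typing import List, Tuple

Image = List[List[int]]   # binary image: 1 = "on" (text) pixel, 0 = background
Span = Tuple[int, int]    # half-open index range (start, end)


def runs_of_nonzero(profile: List[int]) -> List[Span]:
    """Return the half-open ranges over which the profile is non-zero."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(img: Image) -> List[Span]:
    """Row ranges of text lines, from the horizontal projection profile."""
    return runs_of_nonzero([sum(row) for row in img])


def segment_chars(img: Image, top: int, bottom: int) -> List[Span]:
    """Column ranges of characters, from the vertical profile of one line."""
    width = len(img[0]) if img else 0
    profile = [sum(img[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs_of_nonzero(profile)


def two_means_threshold(gaps: List[int], iters: int = 20) -> float:
    """1-D k-means with k=2 on gap widths; returns the centroid midpoint."""
    lo, hi = float(min(gaps)), float(max(gaps))
    for _ in range(iters):
        small = [g for g in gaps if abs(g - lo) <= abs(g - hi)]
        large = [g for g in gaps if abs(g - lo) > abs(g - hi)]
        if not large:          # all gaps fell into one cluster
            break
        lo, hi = sum(small) / len(small), sum(large) / len(large)
    return (lo + hi) / 2


def segment_words(chars: List[Span]) -> List[List[Span]]:
    """Group character spans into words using the clustered gap threshold."""
    if len(chars) < 2:
        return [chars] if chars else []
    gaps = [b[0] - a[1] for a, b in zip(chars, chars[1:])]
    threshold = two_means_threshold(gaps)
    words = [[chars[0]]]
    for gap, char in zip(gaps, chars[1:]):
        if gap > threshold:
            words.append([char])   # wide gap: start a new word
        else:
            words[-1].append(char)
    return words


# Toy example: two text lines; the first holds two words of two characters each.
rows = [
    "..............",
    "##.##....##.##",
    "##.##....##.##",
    "..............",
    "###...........",
]
img = [[1 if ch == "#" else 0 for ch in row] for row in rows]
for top, bottom in segment_lines(img):
    print((top, bottom), segment_words(segment_chars(img, top, bottom)))
```

One limitation of this simplified version is that two-cluster k-means always tries to find a "wide" group, so a line holding a single word with slightly uneven character spacing may be split incorrectly. A practical system would need an extra check for that case.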

Bibliographic details

Source
International Conference on Signal and Image Processing | 2014 | 8 pages
Authors
Angadi S.A.; Kodabagi M.M.
Original format
PDF
Chinese Library Classification
Information processing
Keywords
Display Boards; K-Means Clustering; Low Resolution Images; Projection Profile Features; Segmentation

Similar literature

1. A Robust Segmentation Technique for Line, Word and Character Extraction from Kannada Text in Low Resolution Display Board Images [J]. S. A. Angadi, M. M. Kodabagi. International Journal of Image and Graphics. 2014, No. 1a2
2. Word Extraction and Character Segmentation from Text Lines of Unconstrained Handwritten Bangla Document Images [J]. Ram Sarkar, Samir Malakar, Nibaran Das. Journal of Intelligent Systems. 2011, No. 3
3. Text Character Extraction Implementation from Captured Handwritten Image to Text Conversion using Template Matching Technique [J]. Seema Barate, Chaitrali Kamthe, Shweta Phadtare. MATEC Web of Conferences. 2016, No. 2016
4. A Robust Segmentation Technique for Line, Word and Character Extraction from Kannada Text in Low Resolution Display Board Images [C]. Angadi S.A., Kodabagi M.M. International Conference on Signal and Image Processing. 2014
5. Feature extraction in digitized images through image segmentation techniques [D]. Prasadarao, Mokkarala V. 1983
6. Text Extraction from Scene Images by Character Appearance and Structure Modeling [O]. Chucai Yi, Yingli Tian
7. A Study of different Text Line Extraction Techniques for Multi-font and Multi-size Printed Kannada Documents [O]. R Prajna, Ramya V R, Mamatha H.R. 2015
