Skew Estimation by Improved Boundary Growing for Text Documents in South Indian Languages

Shivakumara P.; Nagabhushan P.; Hemantha Kumar G.; Manjunath Aradhya V. N.

Abstract

Estimating the inclination of text lines in skewed documents written in South Indian languages (Kannada, Telugu, Tamil and Malayalam) is not as straightforward as computing the skew of English text documents. The difficulty arises from additional modifier characters, which attach below or above a main character, or as extensions, and remain as disconnected protrusions of that character. Under such circumstances, direct application of the Boundary Growing (BG) method fails to perform accurately, so we propose a corrective step employing Nearest Neighbor Clustering (NNC). BG and NNC jointly derive the coordinates that are input to a moments computation to estimate the angle of inclination. The new model is tested on a variety of documents in Kannada, Telugu, Tamil and Malayalam, including noisy text, text mixed with pictures, and text at different resolutions. For comparison, English texts are also considered.
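The abstract does not spell out the moments computation, so the following is only a minimal sketch of how an inclination angle can be obtained from a set of coordinates using the standard second-order central moments. The function name `skew_angle_from_points` and the sample points are hypothetical, and the BG and NNC steps that would produce the coordinates are not shown; the paper's own formulation may differ.

```python
import math


def skew_angle_from_points(points):
    """Return the orientation, in degrees, of a set of (x, y) points.

    Uses the standard principal-axis formula
        theta = 0.5 * atan2(2 * mu11, mu20 - mu02)
    on second-order central moments. This is an assumed formulation,
    not necessarily the one used in the paper.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Centroid of the point set.
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Second-order central moments (unnormalized; the scale cancels out).
    mu20 = sum((x - mean_x) ** 2 for x, _ in points)
    mu02 = sum((y - mean_y) ** 2 for _, y in points)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points)

    # Angle of the principal axis relative to the x-axis.
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return math.degrees(theta)


# Hypothetical input: points sampled along a text line inclined at 5 degrees.
slope = math.tan(math.radians(5.0))
sample_points = [(float(x), x * slope) for x in range(0, 200, 10)]
angle = skew_angle_from_points(sample_points)
```

The sign of the result depends on the coordinate convention: in image coordinates, where y grows downward, a positive angle corresponds to a clockwise tilt on screen.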

Bibliographic record

Source
Vivek | 2006, Issue 2 | 6 pages
Format PDF
Language English
Classification (CLC): Computing technology, computer technology
