Conference: IEEE International Conference on Computing, Communication and Security

Character Component Segmentation and Categorization of Machine Printed Text in Devanagari (Nepali) Script in Digital Image Processing

Abstract

The accuracy of any digital image processing system for text depends on extracting each core character together with its related components, and on the exactness and consistency of the extracted symbols. Furthermore, pre-categorizing the extracted symbols considerably reduces the processing load at the classification stage. This paper proposes a method for segmenting and categorizing core characters and their components in machine-printed text in Devanagari (Nepali) script. It also presents a method for extracting modifier components that are not connected to the core character. Here, the header line, known as the Shirorekha or Dika, is treated as the major component for segmentation and categorization. The proposed model removes the Shirorekha using a horizontal projection profile of the word, then labels the word image to extract the objects as components. Each object is assigned exactly one label. Some characters lose some of their properties when the Shirorekha is removed, so we reconstruct those characters to preserve the exactness and consistency of the extracted objects. To categorize the extracted objects, we use a set of structural layout features: height-to-width ratio, position within the word, and presence of the Shirorekha over the extracted symbol. Extracted symbols are sorted into five categories: non-Dika characters, regular characters, conjuncts, upper modifiers, and lower modifiers. The results show an accuracy of 98.26% to 100% when compared with the previous method.
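The abstract gives no implementation details, so the following is only a minimal sketch of the general pipeline it describes: locate the header line with a horizontal projection profile, blank it out, and label the remaining connected objects. It assumes a binarized word image stored as nested lists with 1 for ink and 0 for background. The function names, the `ratio=0.8` threshold, and the three position zones are illustrative assumptions, not the authors' method; the character reconstruction step and the five-way categorization are not attempted here.

```python
from collections import deque


def horizontal_projection(img):
    """Count ink pixels (value 1) in each row of a binary word image."""
    return [sum(row) for row in img]


def find_header_rows(img, ratio=0.8):
    """Return the contiguous band of rows around the densest row.

    Assumption: the Shirorekha is the densest horizontal band of the word.
    `ratio` is a hypothetical threshold, not a value taken from the paper.
    """
    profile = horizontal_projection(img)
    peak = max(profile, default=0)
    if peak == 0:
        return []
    top = bottom = profile.index(peak)
    while top > 0 and profile[top - 1] >= ratio * peak:
        top -= 1
    while bottom < len(profile) - 1 and profile[bottom + 1] >= ratio * peak:
        bottom += 1
    return list(range(top, bottom + 1))


def remove_header(img, header_rows):
    """Return a copy of the image with the header rows set to background."""
    rows = set(header_rows)
    return [[0] * len(row) if r in rows else list(row)
            for r, row in enumerate(img)]


def label_components(img):
    """8-connected labelling by breadth-first flood fill.

    Returns (labels, boxes): a label grid in which every object carries one
    label, and a dict mapping label -> (top, left, bottom, right).
    """
    height = len(img)
    width = len(img[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    boxes = {}
    next_label = 0
    for r in range(height):
        for c in range(width):
            if not img[r][c] or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            boxes[next_label] = (top, left, bottom, right)
    return labels, boxes


def height_width_ratio(box):
    """Height-to-width ratio of a bounding box (top, left, bottom, right)."""
    top, left, bottom, right = box
    return (bottom - top + 1) / (right - left + 1)


def position_zone(box, header_rows):
    """Coarse position of an object relative to the removed header line.

    A simplification for illustration; it does not reproduce the paper's
    five categories.
    """
    top, _, bottom, _ = box
    if not header_rows:
        return "no_header"
    if bottom < header_rows[0]:
        return "above_header"
    if top == header_rows[-1] + 1:
        return "attached_below_header"
    return "detached_below_header"


if __name__ == "__main__":
    word = [
        [0, 0, 1, 0, 0, 0, 0, 0],  # mark above the header line
        [1, 1, 1, 1, 1, 1, 1, 1],  # header line
        [0, 1, 0, 0, 0, 1, 1, 0],
        [0, 1, 0, 0, 0, 1, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 0, 0],  # mark detached below the characters
    ]
    header = find_header_rows(word)
    _, boxes = label_components(remove_header(word, header))
    for label, box in boxes.items():
        print(label, box, height_width_ratio(box), position_zone(box, header))
```

On the toy image in the demo, the header is found at row 1 and four objects remain after its removal. Features such as the height-to-width ratio and the zone could then feed a rule-based categorizer, but the actual rules and thresholds would have to come from the full paper.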
