Journal: Advances in Computer Science and Information Technology: ACSIT

Evaluation of Neural Based Feature Extraction Methods for Printed Telugu OCR System


Abstract

Telugu is one of the oldest and most popular languages of India, especially in South India. Little work has been reported on the development of optical character recognition (OCR) systems for Telugu script. Moreover, Telugu is a complex script in which characters are made up of one or more connected components, resulting in a huge number of possible combinations, running into hundreds of thousands. In any OCR system, feature extraction is one of the most important phases. Several methods exist, each suited to different language scripts. These methods are broadly classified as template-based, structural, statistical, neural-network-based and SVM-based. In this paper we describe various feature extraction methods and evaluate them by applying them to Telugu script. In the process we identified diagonal-based, geometrical and distance-metric-based feature extraction methods, and we also propose a pixel-based feature extraction method. All these methods were implemented and evaluated on 364 Telugu characters using a multilayer neural network as the classifier. The recognition accuracies of the geometrical, diagonal, pixel-map and distance-metric-based feature extraction methods are 98.6%, 100%, 98.31% and 99.32% respectively. The experiments indicate that the diagonal-based method is more suitable for Telugu script than the other feature extraction methods evaluated.
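The abstract does not say how the diagonal-based features are computed. As an illustration only, the sketch below follows one formulation commonly found in the character-recognition literature: the binarized character image is split into square zones, foreground pixels are summed along each diagonal of a zone, and the diagonal sums are averaged into one feature per zone. The function name `diagonal_features`, the 0/1 pixel encoding and the default zone size of 10 are assumptions for this example, not details taken from the paper.

```python
def diagonal_features(image, zone=10):
    """Return one diagonal-based feature per zone (illustrative sketch).

    image: list of rows, each a list of 0/1 pixels (1 = foreground).
    zone:  side length of the square zones (assumed value, not from the paper).
    """
    height, width = len(image), len(image[0])
    if height % zone or width % zone:
        raise ValueError("image dimensions must be multiples of the zone size")

    features = []
    for top in range(0, height, zone):
        for left in range(0, width, zone):
            # A zone x zone block has 2*zone - 1 diagonals; pixels with the
            # same r + c lie on the same diagonal.
            diagonal_sums = [0] * (2 * zone - 1)
            for r in range(zone):
                for c in range(zone):
                    diagonal_sums[r + c] += image[top + r][left + c]
            # Average the diagonal sums to get a single value for this zone.
            features.append(sum(diagonal_sums) / len(diagonal_sums))
    return features
```

Under these assumptions, a 90x60 character image with 10x10 zones yields 54 features, which could then be fed to a classifier such as the multilayer neural network mentioned in the abstract.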
