
A New Contour Based Invariant Feature Extraction Approach for the Recognition of Multi-lingual Documents

Abstract

Nowadays, a single OCR system that can recognize multi-lingual documents is essential for improving the capability and performance of existing document analysis systems. In this paper, we present a new technique based on contour detection and a distance measure for recognizing multi-lingual characters, covering the South Indian scripts Kannada, Tamil, Telugu, and Malayalam, together with English upper case, English lower case, English numerals, and Persian alphanumerics. The proposed method finds the boundary of a character using contour detection and passes the resulting contour to a feature extraction scheme, which produces distinct, invariant features for identifying characters of different languages. The invariant features are obtained by computing the distance between the centroid and the contour pixels of the character image. To evaluate the method, we compare its experimental results with those of existing methods. In our experiments, the proposed method achieves 100% accuracy at low cost and in little time. In addition, the method is invariant to rotation, scaling, and translation (RST) transformations.
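
The centroid-to-contour distance idea can be illustrated with a minimal sketch. The abstract does not give the exact contour detector, centroid definition, or feature layout, so everything below is an assumption made for illustration: the function names `contour_pixels` and `centroid_distance_features`, the 4-neighbour boundary test, the use of the centroid of the contour pixels, the division by the largest distance (for scale), and the fixed-size histogram (so that pixel order does not matter).

```python
import math


def contour_pixels(img):
    """Return (row, col) of foreground pixels that touch background or the image edge.

    `img` is a list of equal-length rows holding 0 (background) or 1 (foreground).
    This 4-neighbour test is a stand-in for whatever contour detector the paper uses.
    """
    h, w = len(img), len(img[0])
    contour = []
    for r in range(h):
        for c in range(w):
            if not img[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if rr < 0 or rr >= h or cc < 0 or cc >= w or not img[rr][cc]:
                    contour.append((r, c))
                    break
    return contour


def centroid_distance_features(img, bins=8):
    """Histogram of normalized centroid-to-contour distances (illustrative only)."""
    pts = contour_pixels(img)
    if not pts:
        raise ValueError("image has no foreground pixels")

    # Assumption: the centroid is taken over the contour pixels themselves.
    cy = sum(r for r, _ in pts) / len(pts)
    cx = sum(c for _, c in pts) / len(pts)

    # Distances depend only on positions relative to the centroid,
    # so shifting the character does not change them.
    dists = [math.hypot(r - cy, c - cx) for r, c in pts]

    # Assumption: divide by the largest distance to reduce dependence on size.
    dmax = max(dists)
    scale = dmax if dmax > 0 else 1.0

    # Assumption: a histogram discards pixel order, so the result does not
    # depend on where along the contour the traversal starts.
    hist = [0] * bins
    for d in dists:
        idx = min(int(d / scale * bins), bins - 1)
        hist[idx] += 1
    return [n / len(pts) for n in hist]
```

On a pixel grid, translation by whole pixels leaves these features unchanged, while rotation and scaling are preserved only approximately, because resampling alters the set of contour pixels. A classifier would then compare such feature vectors, for example by nearest-neighbour distance, although the abstract does not specify that step.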