International Conference on Intelligent Systems and Signal Processing

An OCR for separation and identification of mixed English — Gujarati digits using kNN classifier


Abstract

This paper addresses the script identification problem in bilingual printed document images. We propose an OCR system that separates and identifies mixed English-Gujarati digits. The system is first trained on standard data samples. For testing, samples are collected from different printed sources such as newspapers, books, and magazines. Each pre-processed image, which may be of arbitrary size, is normalized to a uniform size. A statistical approach is used for feature extraction, and a kNN classifier is used for classification. The model achieves an average accuracy of 99.26% for Gujarati digits, 99.20% for English digits, and 99.23% overall.
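
The pipeline described in the abstract (size normalization, statistical features, kNN classification) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state the normalized image size, the specific statistical features, the value of k, or the distance metric. The 16x16 size, the zone-density features, k = 3, and the Euclidean distance are all assumptions, and the function names `normalize`, `zone_features`, `knn_predict`, and `toy_bar` are hypothetical. Labels are (script, digit) pairs, so a single prediction yields both the script and the digit.

```python
from collections import Counter

import numpy as np


def normalize(img, size=16):
    """Resample a 2-D binary glyph image to size x size (nearest neighbour).

    The 16x16 target size is an assumption; the abstract gives no value.
    """
    img = np.asarray(img, dtype=float)
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]


def zone_features(img, zones=4):
    """Mean ink density in each cell of a zones x zones grid.

    Zone density is one common statistical feature, assumed here;
    the image side must be divisible by `zones`.
    """
    n = img.shape[0] // zones
    return img.reshape(zones, n, zones, n).mean(axis=(1, 3)).ravel()


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training vectors (Euclidean, assumed)."""
    dists = np.linalg.norm(train_x - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def toy_bar(height, width, vertical):
    """Toy stand-in for a scanned glyph: a single vertical or horizontal bar.

    These are NOT real digit shapes; they only exercise the pipeline.
    """
    img = np.zeros((height, width))
    if vertical:
        img[:, width // 2 - 2 : width // 2 + 2] = 1
    else:
        img[height // 2 - 2 : height // 2 + 2, :] = 1
    return img


# Toy training set: images of different sizes with placeholder (script, digit) labels.
train_imgs = [toy_bar(20, 20, True), toy_bar(32, 24, True),
              toy_bar(20, 20, False), toy_bar(32, 24, False)]
train_y = [("en", 1), ("en", 1), ("gu", 0), ("gu", 0)]
train_x = np.array([zone_features(normalize(im)) for im in train_imgs])

# Classify an unseen image of yet another size.
query = zone_features(normalize(toy_bar(40, 28, True)))
script, digit = knn_predict(train_x, train_y, query)
print(script, digit)
```

Encoding the script in the label keeps the sketch to a single classifier; a two-stage design (script first, then digit) would be an equally plausible reading of the abstract.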
