Published in: International Journal of Intelligent Enterprise

Real time noisy dataset implementation of optical character identification using CNN

Abstract

Optical character recognition (OCR) is one of the major research problems in real-time applications; it is used to recognise all the characters in an image. English is a universal language, and recognising English characters remains a challenging task. Deep learning is one approach to recognising optical characters. The aim of this research is to perform character recognition using a convolutional neural network (CNN) with the LeNet architecture. The dataset used in this work is a scanned passport dataset, from which all the characters and digits were generated using Tesseract. It comprises a training set of 60,795 samples and a test set of 7,767 samples, for a total of 68,562 samples divided among 62 labels. To date, there has been no research on predicting all 52 characters and ten digits. The algorithm used in this work is based on deep learning with some appropriate layers, and it shows a significant improvement in accuracy and a reduced error rate. The developed model was evaluated on the test dataset for prediction and achieves 93.4% accuracy on the training set and 86.5% accuracy on the test set.
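The figures in the abstract can be checked with a short sketch. The Python below builds the 62-label set (assuming the 52 characters are the upper- and lower-case English letters), restates the reported split sizes, and walks the feature-map sizes of a classic LeNet-5 stack. The 32x32 input size, the filter counts (6 and 16), and the dense sizes (120 and 84) are assumptions borrowed from the textbook LeNet-5 design; the abstract does not state the layer configuration actually used, and the names `conv_out` and `lenet_shapes` are hypothetical.

```python
import string

# The 62 classes: 26 upper-case + 26 lower-case letters + 10 digits (assumed reading of the abstract).
LABELS = string.ascii_uppercase + string.ascii_lowercase + string.digits

# Split sizes as reported in the abstract.
TRAIN_SIZE, TEST_SIZE = 60_795, 7_767


def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet_shapes(input_size: int = 32, num_classes: int = len(LABELS)):
    """Walk a classic LeNet-5 stack; layer sizes are assumptions, not the paper's."""
    shapes = []
    s = conv_out(input_size, 5)        # first convolution: 6 filters, 5x5
    shapes.append(("conv1", (6, s, s)))
    s = conv_out(s, 2, stride=2)       # 2x2 pooling, stride 2
    shapes.append(("pool1", (6, s, s)))
    s = conv_out(s, 5)                 # second convolution: 16 filters, 5x5
    shapes.append(("conv2", (16, s, s)))
    s = conv_out(s, 2, stride=2)       # 2x2 pooling, stride 2
    shapes.append(("pool2", (16, s, s)))
    shapes.append(("fc1", (120,)))     # dense layers of the textbook design
    shapes.append(("fc2", (84,)))
    shapes.append(("output", (num_classes,)))  # one score per label
    return shapes


if __name__ == "__main__":
    print("labels:", len(LABELS))
    print("total samples:", TRAIN_SIZE + TEST_SIZE)
    for name, shape in lenet_shapes():
        print(name, shape)
```

Under these assumptions the last pooling stage yields a 16x5x5 feature map (400 values) feeding the dense layers, and only the final layer changes from the original ten-class LeNet-5 to the 62 classes described here.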
