Journal: Procedia Computer Science

Implementation of Optical Character Recognition using Tesseract with the Javanese Script Target in Android Application


Abstract

Recognising characters in text has been a popular topic in computer vision. Its applications address many practical problems, for example recognising text in documents, classifying the text or scripts of documents, and plate recognition. Many researchers have developed methods for recognising characters using Optical Character Recognition (OCR). Although text recognition with OCR has been more or less solved, most of the OCR problems explored so far concern Latin-alphabet text. Meanwhile, several languages use non-Latin scripts for their written text. Recognising a non-Latin script is quite challenging because the contours and shapes of the text differ from those of Latin script. This research aims to collect a dataset for OCR of Javanese characters. A total of 5880 characters were collected and used to train models with several methods using the Tesseract OCR tools. The models were then deployed on an Android mobile phone. The highest accuracy (97.50%) was achieved by combining a single bounding box for the whole character with separate bounding boxes for the main body and sandangan parts.
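The two kinds of bounding box mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the training annotations use Tesseract's box-file line layout (symbol, left, bottom, right, top, page, with the origin at the bottom-left corner of the image). The abstract does not say how the authors built or combined their boxes, so the helper names (Box, parse_box_line, merge_boxes, to_box_line) and the coordinates are hypothetical and shown for illustration only.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """One box-file entry; coordinates are measured from the bottom-left corner."""
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int = 0


def parse_box_line(line: str) -> Box:
    """Parse a line of the form '<symbol> <left> <bottom> <right> <top> <page>'."""
    symbol, left, bottom, right, top, page = line.rsplit(" ", 5)
    return Box(symbol, int(left), int(bottom), int(right), int(top), int(page))


def merge_boxes(parts: list[Box], symbol: str) -> Box:
    """Return the smallest box enclosing every part (all assumed to be on one page)."""
    return Box(
        symbol,
        min(b.left for b in parts),
        min(b.bottom for b in parts),
        max(b.right for b in parts),
        max(b.top for b in parts),
        parts[0].page,
    )


def to_box_line(box: Box) -> str:
    """Serialise a box back to the box-file line layout."""
    return f"{box.symbol} {box.left} {box.bottom} {box.right} {box.top} {box.page}"


# Hypothetical coordinates: a main body glyph (U+A98F, ka) with a sandangan
# mark (U+A9B6, wulu) drawn above it.
body = parse_box_line("\ua98f 10 20 50 60 0")
mark = parse_box_line("\ua9b6 15 62 45 75 0")

# Separate boxes keep the two parts apart; the merged box covers the whole character.
whole = merge_boxes([body, mark], body.symbol + mark.symbol)
print(to_box_line(whole))
```

Under these assumptions, a training set could hold the two separate entries, the merged entry, or both. The abstract reports the best accuracy for a combination of the two kinds of box, without giving further detail.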
