OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym

Abstract

Optical character recognition (OCR), a classic machine learning challenge, has been a longstanding topic in a variety of applications in the healthcare, education, insurance, and legal industries, where it is used to convert different types of electronic documents, such as scanned documents, digital images, and PDF files, into fully editable and searchable text data. The rapid generation of digital images on a daily basis makes OCR an imperative and foundational tool for data analysis. With the help of OCR systems, we have been able to save a reasonable amount of effort in creating, processing, and saving electronic documents, adapting them to different purposes. A set of different OCR platforms is now available which, aside from lending theoretical contributions to other practical fields, have demonstrated successful applications in real-world problems. In this work, several qualitative and quantitative experimental evaluations have been performed using four well-known OCR services: Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. We analyze the accuracy and reliability of the OCR packages employing a dataset including 1227 images from 15 different categories. Furthermore, we review the state-of-the-art OCR applications in healthcare informatics. The present evaluation is expected to advance OCR research, providing new insights and considerations to the research area, and to assist researchers in determining which service is ideal for optical character recognition in an accurate and efficient manner.
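
The abstract does not say how accuracy was measured. As an illustration only, the sketch below shows one common way such a comparison could be scored: character-level accuracy derived from the Levenshtein edit distance between a ground-truth transcription and an OCR output, averaged per image category. The metric, the function names, and the `(category, ground_truth, ocr_output)` sample layout are all assumptions made for this example, not the authors' method or dataset format.

```python
from collections import defaultdict


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """1 - (edit distance / reference length), clamped to the range [0, 1]."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    return max(0.0, 1.0 - edit_distance(reference, hypothesis) / len(reference))


def accuracy_by_category(samples):
    """Average character accuracy per category.

    `samples` is an iterable of (category, ground_truth, ocr_output) tuples.
    This layout is hypothetical; it is not the paper's dataset format.
    """
    scores = defaultdict(list)
    for category, truth, output in samples:
        scores[category].append(character_accuracy(truth, output))
    return {category: sum(vals) / len(vals) for category, vals in scores.items()}
```

To compare several OCR services under this assumed metric, one would run each service over the same images, call `accuracy_by_category` once per service, and compare the resulting per-category averages.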

Bibliographic record

Source
International Symposium on Visual Computing | 2016 | pp. 735-746 | 12 pages
Authors
Ahmad P. Tafti; Ahmadreza Baghaie; Mehdi Assefi; Hamid R. Arabnia; Zeyun Yu; Peggy Peissig
Original format: PDF

Similar literature

1. OCR: FineReader 10 Abbyy Software [J]. Jean-Jacques Maleval. MOS: Le Magazine Du Stockage Et De La Gestion D'Informations. 2010, No. 261
2. OCR: les nouvelles fonctions de FineReader 9.0 d'Abbyy Software [J]. Jean-Jacques Maleval. MOS: Le Magazine Du Stockage Et De La Gestion D'Informations. 2008, No. 245
3. An experimental evaluation of OCR text representations for learning document classifiers [J]. Markus Junker, Rainer Hoch. International Journal on Document Analysis and Recognition. 1998, No. 2
4. OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym [C]. Ahmad P. Tafti, Ahmadreza Baghaie, Mehdi Assefi. International Symposium on Visual Computing. 2016
5. A hybrid two-dimensional HMM and MLP OCR system for processing multi-font and low-quality English documents [D]. Fu, Nenghong. 2004
6. Towards Mobile OCR: How To Take a Good Picture of a Document Without Sight [O]. Michael Cutter, Roberto Manduchi
7. OCR with Tesseract, Amazon Textract, and Google Document AI: A Benchmarking Experiment [O]. Thomas Hegghammer. 2021
