anyOCR: An Open-Source OCR System for Historical Archives

Abstract

Intensive research is currently under way on digitizing historical archives, that is, converting scanned document images into searchable full text. This paper presents the "anyOCR" system, which emphasizes the techniques required for digitizing a historical archive with high accuracy. It is an open-source system that the research community can readily apply to digitizing historical archives. The anyOCR system supports a complete document processing pipeline, which includes layout analysis, OCR model training, and text line prediction, together with intelligent, interactive web applications for correcting layout and OCR errors. The anyOCR system can also be used for contemporary document images with diverse layouts, from simple to complex. This paper describes the current state of the anyOCR system, its architecture, and its major features. It also provides information about the availability, documentation, and tutorials of the anyOCR system.

Bibliographic details

Source
IAPR International Conference on Document Analysis and Recognition | 2017 | pp. 305-310 | 6 pages
Authors
Syed Saqib Bukhari; Ahmad Kadi; Mohammad Ayman Jouneh; Fahim Mahmood Mir; Andreas Dengel
Original format: PDF
Keywords
Optical character recognition software; Image segmentation; Pipelines; Layout; Training; Particle separators; Text recognition

Similar literature

1. Ocropodium: open source OCR for small-scale historical archives [J]. Tobias Blanke, Michael Bryant, Mark Hedges. Journal of Information Science. 2012, No. 1
2. An open-source database model and collections management system for fish scale and otolith archives [J]. Tray Elizabeth, Leadbetter Adam, Meaney Will. Ecological informatics: an international journal on ecoinformatics and computational ecology. 2020, No. 1
3. A reliable, low-cost picture archiving and communications system for small and medium veterinary practices built using open-source technology [J]. Bryan Iotti, Alberto Valazza. Journal of digital imaging: the official journal of the Society for Computer Applications in Radiology. 2014, No. 5
4. anyOCR: An Open-Source OCR System for Historical Archives [C]. Syed Saqib Bukhari, Ahmad Kadi, Mohammad Ayman Jouneh. IAPR International Conference on Document Analysis and Recognition. 2017
5. Preserving our past for the future: Designing a geographic information system for archiving historical cemetery information [D]. Titus, Christine Ann. 2008
6. A Reliable Low-Cost Picture Archiving and Communications System for Small and Medium Veterinary Practices Built Using Open-Source Technology [O]. Bryan Iotti, Alberto Valazza. 2014
7. Evaluation of Open-source Software for Participatory Digital Archives: Understanding System Requirements for No Gun Ri Digital Archives [O]. Taeyeon Park, Donghee Sinn. 2016
