Conference: IAPR International Conference on Document Analysis and Recognition

anyOCR: An Open-Source OCR System for Historical Archives

Abstract

A considerable amount of research is currently under way on digitizing historical archives, that is, converting scanned document images into searchable full text. This paper presents the "anyOCR" system, which emphasizes the techniques required to digitize a historical archive with high accuracy. It is an open-source system that the research community can readily apply to digitizing historical archives. The anyOCR system supports a complete document processing pipeline, which includes layout analysis, OCR model training, and text line prediction, together with intelligent, interactive web applications for correcting layout and OCR errors. The anyOCR system can also be used for contemporary document images with diverse layouts, from simple to complex. This paper describes the current state of the anyOCR system, its architecture, and its major features. It also provides information about the availability, documentation, and tutorials of the anyOCR system.
