Conference: International Conference on Document Analysis and Recognition

ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records


Abstract

In this paper, we present a large historical database of Chinese family records, with the aim of developing robust systems for historical document analysis. To this end, we propose a Historical Document Reading Challenge on Large Chinese Structured Family Records (ICDAR 2019 HDRC-CHINESE). The objective of the competition is to analyze the page layout and then detect and recognize the textlines and characters in a large historical document image dataset containing more than 100,000 pages. Cascade R-CNN, CRNN, and U-Net based architectures were trained to evaluate performance on these tasks. An error rate of 0.01 was recorded for textline recognition (Task1), whereas a Jaccard Index of 99.54% was recorded for layout analysis (Task2). A graph edit distance based total error ratio of 1.5% was recorded for complete integrated textline detection and recognition (Task3).
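The abstract reports a Jaccard Index for layout analysis and an error rate for textline recognition without stating how either is computed. As a rough illustration only, the sketch below uses the textbook definitions: intersection over union of two pixel sets, and Levenshtein edit distance normalized by reference length. The function names, the pixel-set representation, and the normalization are assumptions made for this example and are not the competition's official evaluation protocol; the graph edit distance used for Task3 is not covered.

```python
from typing import Sequence, Set, Tuple

Pixel = Tuple[int, int]  # (row, column); an assumed representation of a labeled region


def jaccard_index(predicted: Set[Pixel], reference: Set[Pixel]) -> float:
    """Intersection over union of two pixel sets; defined as 1.0 when both are empty."""
    union = predicted | reference
    if not union:
        return 1.0
    return len(predicted & reference) / len(union)


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance with unit costs, computed with a rolling row."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (item_a != item_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def error_rate(recognized: str, reference: str) -> float:
    """Edit distance divided by reference length (an assumed normalization)."""
    if not reference:
        return 0.0 if not recognized else 1.0
    return edit_distance(recognized, reference) / len(reference)
```

Under these definitions, a predicted region that overlaps the reference on one of three total pixels scores a Jaccard Index of 1/3, and a four-character textline with one wrong character has an error rate of 0.25.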
