Conference: International Conference on Document Analysis and Recognition

Unified Performance Evaluation for OCR Zoning: Calculating Page Segmentation's Score, That Includes Text Zones, Tables and Non-text Objects

Abstract

Optical character recognition (OCR) systems decompose printed pages into a set of text zones, tables, and non-text objects such as pictures and charts. This part of the OCR process is known as the page zoning task. In this paper we present a methodology for assessing page zoning as a single task, whereas many authors evaluate the locations of tables, pictures, and text separately. The key advantage of the proposed system is that it naturally combines the evaluation of text and table locations and is resistant to most segmentation ambiguities. We calculate the score for text and tables based on ground-truth character locations. The score for non-text object locations is based on area matching. These scores are combined to obtain the final page score.
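
The abstract does not state the scoring formulas, so the following is only an illustrative sketch of how the three steps could fit together, not the authors' method. It assumes that a ground-truth character counts as correct when its center lies inside a detected zone of the same kind, that non-text objects are compared by intersection-over-union, and that the two scores are combined with a fixed weight. The names `Box`, `char_score`, `area_score`, and `page_score`, and the weight of 0.5, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1

    def intersect(self, other: "Box") -> "Box":
        return Box(max(self.x0, other.x0), max(self.y0, other.y0),
                   min(self.x1, other.x1), min(self.y1, other.y1))


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (assumed area-matching measure)."""
    inter = a.intersect(b).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def char_score(chars, zones) -> float:
    """Fraction of ground-truth characters covered by a zone of the same kind.

    chars: list of (x, y, kind) character centers, kind in {"text", "table"}.
    zones: list of (Box, kind) detected zones.
    """
    if not chars:
        return 1.0
    hits = sum(
        1 for x, y, kind in chars
        if any(zk == kind and box.contains(x, y) for box, zk in zones)
    )
    return hits / len(chars)


def area_score(gt_boxes, det_boxes) -> float:
    """Sum of best IoU per ground-truth object, normalized by the larger
    of the two counts so that spurious detections lower the score."""
    if not gt_boxes and not det_boxes:
        return 1.0
    total = sum(max((iou(g, d) for d in det_boxes), default=0.0)
                for g in gt_boxes)
    return total / max(len(gt_boxes), len(det_boxes))


def page_score(text_table: float, non_text: float, weight: float = 0.5) -> float:
    """Weighted mean of the two scores; the weight is an assumption."""
    return weight * text_table + (1.0 - weight) * non_text


# Toy page: four ground-truth characters, two detected zones, one picture.
chars = [(1, 1, "text"), (2, 1, "text"), (6, 6, "table"), (9, 9, "table")]
zones = [(Box(0, 0, 3, 3), "text"), (Box(5, 5, 7, 7), "table")]
gt_pictures = [Box(0, 0, 2, 2)]
det_pictures = [Box(0, 0, 2, 1)]

s_chars = char_score(chars, zones)
s_area = area_score(gt_pictures, det_pictures)
print(s_chars, s_area, page_score(s_chars, s_area))
```

Scoring text and tables per character rather than per zone is what would make such a measure tolerant of segmentation ambiguity: splitting one paragraph into two zones covers the same characters and leaves the score unchanged.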