Conference: International Conference on Document Analysis and Recognition

Labeling, Cutting, Grouping: An Efficient Text Line Segmentation Method for Medieval Manuscripts

Abstract

This paper introduces a new method for text-line extraction that integrates deep-learning-based pre-classification with state-of-the-art segmentation methods. Text-line extraction in complex handwritten documents poses a significant challenge, even to the most modern computer vision algorithms. Historical manuscripts are a particularly hard class of documents because they present several forms of noise, such as degradation, bleed-through, interlinear glosses, and elaborate scripts. In this work, we propose a novel method that uses semantic segmentation at the pixel level as an intermediate task, followed by a text-line extraction step. We measured the performance of our method on a recent dataset of challenging medieval manuscripts and surpassed state-of-the-art results, reducing the error by 80.7%. Furthermore, we demonstrate the effectiveness of our approach on various other datasets written in different scripts. Hence, our contribution is two-fold. First, we demonstrate that semantic pixel segmentation can be used as a strong denoising pre-processing step before performing text-line extraction. Second, we introduce a novel, simple, and robust algorithm that leverages the high-quality semantic segmentation to achieve a text-line extraction performance of 99.42% line IU on a challenging dataset.
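The abstract reports its headline result as "line IU" without defining the metric. The sketch below shows one plausible reading, offered only as an illustration: a predicted line counts as a true positive when its pixel-level intersection over union with a ground-truth line reaches a threshold, and line IU is TP / (TP + FP + FN). The function names `pixel_iu` and `line_iu`, the greedy one-to-one matching, and the 0.75 threshold are all assumptions made for this example, not details taken from the paper.

```python
def pixel_iu(a, b):
    """Intersection over union of two sets of (row, col) pixel coordinates."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def line_iu(predicted, truth, threshold=0.75):
    """Line-level IU, TP / (TP + FP + FN), under greedy one-to-one matching.

    The matching rule and the default threshold are assumptions for
    illustration; an official evaluation tool may match lines differently.
    """
    unmatched = list(range(len(truth)))  # ground-truth lines not yet matched
    tp = 0
    for pred in predicted:
        # Pick the best still-unmatched ground-truth line for this prediction.
        best = max(unmatched, key=lambda i: pixel_iu(pred, truth[i]), default=None)
        if best is not None and pixel_iu(pred, truth[best]) >= threshold:
            unmatched.remove(best)
            tp += 1
    fp = len(predicted) - tp  # predictions that matched nothing
    fn = len(unmatched)       # ground-truth lines that were missed
    total = tp + fp + fn
    return tp / total if total else 1.0


# Toy page: two ground-truth lines, each 10 pixels wide.
truth = [{(0, c) for c in range(10)}, {(2, c) for c in range(10)}]
# First prediction covers 9 of 10 pixels; second covers only half of its line.
predicted = [{(0, c) for c in range(9)}, {(2, c) for c in range(5)}]
score = line_iu(predicted, truth)
```

On this toy page the first prediction is accepted (pixel IU 0.9) and the second is rejected (pixel IU 0.5), so the rejected prediction counts once as a spurious line and its ground-truth line counts once as missed. Greedy matching keeps the sketch short; an optimal assignment could give a different count on crowded pages.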
