Segmentation of historical machine-printed documents using Adaptive Run Length Smoothing and skeleton segmentation paths

Abstract

In this paper, we develop efficient techniques for segmenting document pages produced by digitizing historical machine-printed sources. Such documents often suffer from low quality, local skew, and degradations caused by poor printing-matrix quality or ink diffusion, and they exhibit complex, dense layouts. To address these problems, we introduce the following innovations: (i) a novel Adaptive Run Length Smoothing Algorithm (ARLSA) to handle complex and dense document layouts, (ii) detection of noisy areas and punctuation marks, which are common in historical machine-printed documents, (iii) detection of possible obstacles formed by background areas in order to separate neighboring text columns or text lines, and (iv) skeleton segmentation paths to isolate possibly connected characters. Comparative experiments on several historical machine-printed documents demonstrate the efficiency of the proposed technique.
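
For orientation, the sketch below shows the classic, non-adaptive Run Length Smoothing Algorithm (RLSA) that the name ARLSA builds on. The abstract does not say how the adaptive variant chooses or constrains its smoothing, so this is not the authors' method. The function names, the fixed `threshold` parameter, and the toy `page` are assumptions made for illustration: along each row, background gaps no longer than `threshold` that lie between two ink pixels are filled, which merges nearby characters into word or line blobs.

```python
def rlsa_row(row, threshold):
    """Classic horizontal RLSA on one row (1 = ink, 0 = background).

    Background runs of length <= threshold that lie between two ink
    pixels are filled with ink. Leading and trailing background is kept.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_ink is not None:
                gap = i - last_ink - 1
                if 0 < gap <= threshold:
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, threshold):
    """Apply the row-wise smoothing to every row of a binary image."""
    return [rlsa_row(row, threshold) for row in image]


# Hypothetical toy page: two rows of binary pixels.
page = [
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 0, 0],
]
smoothed = rlsa_horizontal(page, threshold=2)
for row in smoothed:
    print(row)
```

With `threshold=2`, the two-pixel gap in the first row is filled while the four-pixel gap is left open. A single fixed threshold like this is the kind of setting that tends to struggle on dense layouts, which is the problem the abstract says the adaptive variant targets.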

Bibliographic details

  • Source
    Image and Vision Computing | 2010, Issue 4 | pp. 590-604 | 15 pages
  • Author affiliations

    Department of Electrical and Computer Engineering, Democritus University of Thrace, 67 100 Xanthi, Greece, and Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research 'Demokritos', 153 10 Athens, Greece;

    Department of Electrical and Computer Engineering, Democritus University of Thrace, 67 100 Xanthi, Greece;

    Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research 'Demokritos', 153 10 Athens, Greece;

    Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research 'Demokritos', 153 10 Athens, Greece;

    Department of Electrical and Computer Engineering, Democritus University of Thrace, 67 100 Xanthi, Greece;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    text line segmentation; word segmentation; character segmentation; historical machine-printed documents; run length smoothing algorithm;
