Adaptive segmentation of document images.

Abstract

A method is presented for the efficient segmentation of text lines from scanned images of technical documents. The method has been implemented in the ARXYC (Adaptive Recursive XY Cut) algorithm, which constructs an XY-tree to represent the geometric layout structure of a page image in which the text lines are found as leaf nodes.

Geometric layout analysis is a subcomponent of the Document Image Analysis processing sequence and is typically preceded by scanning a document into a pixel map, preprocessing of the pixel map to reduce noise and remove skew and by thresholding to a binary image, and typically followed by a mapping of the geometric layout to a function representation and recovery of text and graphics from the pixel image.

Technical documents are sufficiently varied in structure to be challenging to segmentation algorithms yet sufficiently regular to be amenable to analysis. The vast store of archived technical documentation attests to the importance of the task.

ARXYC achieves high generality by depending on only a single primary parameter, the resolution-independent gap-ratio-threshold. ARXYC constructs an initial XY-tree in which the desired text lines are over-segmented into many fragments, then dynamically transforms the XY-tree to the target tree employing three elegant operators, cut, glue and flip, while adaptively applying the threshold to the merging of fragments into text lines. ARXYC monitors the dynamic changes in the structure of the XY-tree to avoid the most serious segmentation error, merging two fragments across the gap between columns.

Results are shown for three experiments on an image set of 97 document pages from a variety of technical journals. The first selects a single fixed threshold for a set of documents based on a sample from that set. The second selects a single fixed threshold for a specific image based on intrinsic measures of the onset of column bridging. Finally, ARXYC adaptively applies a varying threshold to each image guided by the dynamic behavior of the XY-tree, matching, on average, 98.8% of the ground truth text lines.
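The abstract does not define the gap-ratio-threshold or the cut, glue and flip operators, so ARXYC itself cannot be reproduced from this record. As a rough illustration of the underlying idea only, the sketch below shows a plain recursive XY cut on a binary image: it projects ink onto one axis, cuts at whitespace gaps, and recurses on each piece in the other direction, so that text lines end up as leaf nodes of the tree. The names `Node`, `runs`, `split`, `xy_cut` and `leaves`, and the fixed pixel thresholds `min_gap_y` and `min_gap_x`, are hypothetical choices for this example and are not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


@dataclass
class Node:
    # Bounding box as half-open ranges: rows [top, bottom), columns [left, right).
    top: int
    bottom: int
    left: int
    right: int
    children: List["Node"] = field(default_factory=list)


def runs(profile: List[int], min_gap: int) -> List[Tuple[int, int]]:
    """Return half-open ink runs separated by at least min_gap empty bins."""
    out, start, gap = [], None, 0
    for i, value in enumerate(profile):
        if value:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                out.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        out.append((start, len(profile) - gap))
    return out


def split(img: Image, node: Node, min_gap: int, horizontal: bool):
    """Cut the node's box along whitespace bands in one direction."""
    rows = range(node.top, node.bottom)
    if horizontal:  # project onto the vertical axis, cut between rows
        profile = [sum(img[r][node.left:node.right]) for r in rows]
        return [(node.top + a, node.top + b, node.left, node.right)
                for a, b in runs(profile, min_gap)]
    # project onto the horizontal axis, cut between columns
    profile = [sum(img[r][c] for r in rows)
               for c in range(node.left, node.right)]
    return [(node.top, node.bottom, node.left + a, node.left + b)
            for a, b in runs(profile, min_gap)]


def xy_cut(img: Image, node: Node, min_gap_y: int, min_gap_x: int,
           horizontal: bool = True, stalled: bool = False) -> Node:
    """Build an XY-tree below node by alternating cut directions."""
    boxes = split(img, node, min_gap_y if horizontal else min_gap_x, horizontal)
    if len(boxes) <= 1:
        # No cut possible this way: try the other direction once, then stop.
        if not stalled:
            xy_cut(img, node, min_gap_y, min_gap_x, not horizontal, stalled=True)
        return node
    for box in boxes:
        child = Node(*box)
        node.children.append(child)
        xy_cut(img, child, min_gap_y, min_gap_x, not horizontal)
    return node


def leaves(node: Node) -> List[Node]:
    """Collect leaf nodes in tree order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


# Toy page: two columns whose text lines are vertically offset from each other.
page = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
tree = xy_cut(page, Node(0, 4, 0, 8), min_gap_y=1, min_gap_x=2)
for leaf in leaves(tree):
    print((leaf.top, leaf.bottom, leaf.left, leaf.right))
```

On the toy page no horizontal cut exists at the top level, so the sketch first separates the two columns and then splits each column into its two lines. A fixed pixel gap like the one used here depends on scan resolution and has to be tuned per document set, which is the limitation the abstract's single resolution-independent parameter and adaptive merging are meant to address.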

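Bibliographic record

  • Author

    Sylwester, Donald Robert

  • Author affiliation

    The University of Nebraska - Lincoln

  • Degree-granting institution: The University of Nebraska - Lincoln
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2001
  • Pages: 123 p.
  • Total pages: 123
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Automation technology, computer technology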