Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

Probabilistic homogeneity for document image segmentation

Abstract

In this paper we propose a novel probabilistic framework for document segmentation that exploits human perceptual recognition of text regions in complicated layouts. In particular, we conceptualize text homogeneity as the Gestalt pattern displayed in text regions, characterized by proximately and symmetrically arranged units with similar morphological and texture features. We model this pattern in the local region of a connected component (CC) using a hierarchical formulation, which simulates a random walk-and-check on a graph encoding the neighborhood of the CC. The proposed formulation allows an effective computation of what we call the probabilistic local text homogeneity (PLTH) using a weighted summation of the weights of the graph, which are derived from a probabilistic description of the homogeneity between neighboring CCs and computed through Bayesian cue integration. The proposed PLTH enables a multi-aspect analysis, where various primitives such as geometrical configuration, morphological features, texture characterization and location priors are integrated in one computational probabilistic model. This enables an effective text and non-text classification of CCs preceding any grouping process, which is currently absent in document segmentation. Experimental results show that our segmentation method based on the proposed PLTH model improves upon the state of the art. (C) 2020 The Authors. Published by Elsevier Ltd.
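
The abstract gives no formulas, so the following Python sketch only illustrates the two ideas it names: Bayesian cue integration for an edge weight between neighboring CCs, and a weighted summation of those edge weights around one CC. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `integrate_cues` and `local_homogeneity`, the conditional-independence (naive Bayes) combination rule, the uniform prior, and the normalization of the walk weights.

```python
import math


def integrate_cues(cue_probs, prior=0.5):
    """Fuse per-cue homogeneity probabilities for one pair of neighboring CCs.

    Assumption (not from the paper): cues are conditionally independent and
    each cue probability was computed under a uniform prior, so p / (1 - p)
    acts as that cue's likelihood ratio.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for p in cue_probs:
        # Clamp to avoid log(0) for saturated cues.
        p = min(max(p, 1e-9), 1.0 - 1e-9)
        log_odds += math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-log_odds))


def local_homogeneity(edge_probs, walk_weights):
    """Weighted summation of edge homogeneity probabilities around one CC.

    `walk_weights` stands in for how likely a random walk is to check each
    neighbor; normalizing them is an assumption of this sketch.
    """
    total = sum(walk_weights)
    if total <= 0:
        raise ValueError("walk_weights must have a positive sum")
    return sum(w * p for w, p in zip(walk_weights, edge_probs)) / total


# Hypothetical cue values for two neighbors of a CC:
# (geometry, morphology, texture) agreement expressed as probabilities.
edges = [integrate_cues([0.8, 0.8, 0.6]), integrate_cues([0.3, 0.4, 0.5])]
score = local_homogeneity(edges, walk_weights=[2.0, 1.0])
```

A CC whose score is high under this kind of measure would be a candidate for the text class before any grouping step; the threshold and the actual cue models would have to come from the paper itself.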
