Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

CHINESE DOCUMENT LAYOUT ANALYSIS BASED ON ADAPTIVE SPLIT-AND-MERGE AND QUALITATIVE SPATIAL REASONING

Abstract

The ultimate goal of automatic document processing is to understand the semantics of a document. Toward that end, a primary enabling step is to reason about the layout of the document by means of page segmentation and spatial reasoning over, or labeling of, the resulting segments. This, in turn, allows the logical organization of the document to be derived. This paper describes a generic document segmentation and geometric relation labeling method with applications to Chinese document analysis. Unlike previous document segmentation methods, which rely on text spacing, border lines, and/or a priori layout models based on template matching, the present method begins with a hierarchy of partitioned image layers in which inhomogeneous higher-level regions are recursively partitioned into lower-level rectangular subregions while, at the same time, smaller homogeneous lower-level regions are merged into larger homogeneous regions. Furthermore, the derived segment data structure readily enables efficient search for geometric relationships between identified document segments. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd. [References: 33]
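
The split-and-merge idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the homogeneity criterion, the splitting rule, or the set of spatial relations, so everything below is an assumption made for illustration. The sketch assumes a binary page image (1 = ink), a homogeneity test based on hypothetical ink-ratio thresholds `lo` and `hi`, a quadtree-style split, a merge that joins same-label rectangles sharing a full edge, and a small set of relation labels. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]   # (top, left, bottom, right); bottom/right exclusive
Region = Tuple[Rect, str]          # rectangle plus its label: 'ink' or 'bg'

def label(img: List[List[int]], r: Rect, lo: float = 0.05, hi: float = 0.95) -> Optional[str]:
    """Return 'bg' or 'ink' for a homogeneous region, None otherwise (assumed criterion)."""
    t, l, b, rt = r
    ink = sum(img[y][x] for y in range(t, b) for x in range(l, rt))
    ratio = ink / ((b - t) * (rt - l))
    if ratio <= lo:
        return 'bg'
    if ratio >= hi:
        return 'ink'
    return None

def split(img: List[List[int]], r: Rect) -> List[Region]:
    """Recursively split inhomogeneous regions into rectangular subregions."""
    lab = label(img, r)
    if lab is not None:            # a single pixel is always homogeneous, so this terminates
        return [(r, lab)]
    t, l, b, rt = r
    my, mx = (t + b) // 2, (l + rt) // 2
    parts = [(t, l, my, mx), (t, mx, my, rt), (my, l, b, mx), (my, mx, b, rt)]
    out: List[Region] = []
    for p in parts:
        if p[2] > p[0] and p[3] > p[1]:   # skip empty quadrants of thin regions
            out.extend(split(img, p))
    return out

def try_merge(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the union rectangle if a and b share a full edge, else None."""
    (t1, l1, b1, r1), (t2, l2, b2, r2) = a, b
    if t1 == t2 and b1 == b2 and (r1 == l2 or r2 == l1):
        return (t1, min(l1, l2), b1, max(r1, r2))
    if l1 == l2 and r1 == r2 and (b1 == t2 or b2 == t1):
        return (min(t1, t2), l1, max(b1, b2), r1)
    return None

def merge(leaves: List[Region]) -> List[Region]:
    """Greedily merge same-label rectangles until no pair can be joined."""
    regions = list(leaves)
    changed = True
    while changed:
        changed = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                (ra, la), (rb, lb) = regions[i], regions[j]
                union = try_merge(ra, rb) if la == lb else None
                if union is not None:
                    regions[j] = (union, la)
                    del regions[i]
                    changed = True
                    break
            if changed:
                break
    return regions

def relation(a: Rect, b: Rect) -> str:
    """Qualitative relation of rectangle a with respect to rectangle b (assumed label set)."""
    if a[3] <= b[1]:
        return 'left-of'
    if a[1] >= b[3]:
        return 'right-of'
    if a[2] <= b[0]:
        return 'above'
    if a[0] >= b[2]:
        return 'below'
    return 'overlaps'

# Toy page: left half ink, right half background.
page = [[1] * 4 + [0] * 4 for _ in range(4)]
segments = merge(split(page, (0, 0, 4, 8)))
print(segments)
print(relation(segments[0][0], segments[1][0]))
```

On this toy page the root region is inhomogeneous, so it is split into four quadrants; the merge step then joins them back into one ink rectangle and one background rectangle, and the ink rectangle is labeled left-of the background. The greedy merge is chosen for brevity; it keeps every region rectangular but does not guarantee the smallest possible number of regions.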
