Journal: Neurocomputing

Structure detection and segmentation of documents using 2D stochastic context-free grammars


Abstract

In this paper we define a bidimensional extension of stochastic context-free grammars for structure detection and segmentation of images of documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of historical marriage license books to validate this approach. We also tested several inference algorithms for probabilistic graphical models and the results showed that the proposed grammatical model outperformed the other methods. Furthermore, grammars also provide the document structure along with its segmentation. (C) 2014 Elsevier B.V. All rights reserved.
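The abstract does not give the grammar, the features, or the parsing algorithm, so the following is only a minimal sketch of the general idea: a page is cut into a grid of cells, a classifier scores each cell, and a CYK-style Viterbi search finds the most likely derivation in which each rule either labels a rectangle or splits it horizontally or vertically. The grid values, the labels ("text", "blank"), the nonterminal names, and the rule probabilities are all hypothetical and chosen for illustration.

```python
import math
from functools import lru_cache

# Hypothetical output of the initial zone classifier: P(cell is "text").
P_TEXT = [
    [0.1, 0.9, 0.9],
    [0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9],
]

# Hypothetical toy grammar (not the paper's).
# Terminal rules assign one label to every cell of a rectangle.
TERMINAL = {
    "Margin": [("blank", 1.0)],
    "Block": [("text", 1.0)],
    "Gap": [("blank", 1.0)],
    "Body": [("text", 0.4)],
}
# Binary rules split a rectangle: "H" = left | right, "V" = top / bottom.
BINARY = {
    "Page": [("H", "Margin", "Body", 1.0)],
    "Body": [("V", "Block", "Tail", 0.6)],
    "Tail": [("V", "Gap", "Body", 1.0)],
}


def cell_logp(label, r, c):
    """Log-probability that cell (r, c) carries the given label."""
    p = P_TEXT[r][c] if label == "text" else 1.0 - P_TEXT[r][c]
    return math.log(p)


@lru_cache(maxsize=None)
def best(nt, r0, r1, c0, c1):
    """Most likely derivation of rows [r0, r1) x cols [c0, c1) from nt.

    Returns (log-probability, tree); tree is None if no derivation exists.
    """
    result = (-math.inf, None)
    for label, prob in TERMINAL.get(nt, []):
        score = math.log(prob) + sum(
            cell_logp(label, r, c)
            for r in range(r0, r1)
            for c in range(c0, c1)
        )
        if score > result[0]:
            result = (score, (label, r0, r1, c0, c1))
    for direction, first, second, prob in BINARY.get(nt, []):
        if direction == "H":
            splits = [((r0, r1, c0, k), (r0, r1, k, c1)) for k in range(c0 + 1, c1)]
        else:
            splits = [((r0, k, c0, c1), (k, r1, c0, c1)) for k in range(r0 + 1, r1)]
        for a, b in splits:
            score_a, tree_a = best(first, *a)
            score_b, tree_b = best(second, *b)
            score = math.log(prob) + score_a + score_b
            if score > result[0]:
                result = (score, (nt, tree_a, tree_b))
    return result


def leaves(tree):
    """Flatten a parse tree into labeled rectangles (label, r0, r1, c0, c1)."""
    if len(tree) == 5:
        return [tree]
    return leaves(tree[1]) + leaves(tree[2])


score, tree = best("Page", 0, 3, 0, 3)
for region in leaves(tree):
    print(region)
```

The parse tree gives the structure (which nonterminal covers which rectangle), and its leaves give the segmentation, which mirrors the abstract's point that a grammar yields both at once. On this toy grid the leaves are a blank left margin and, to its right, two text blocks separated by a blank gap.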

