International Conference on Document Analysis and Recognition

Graph Grammar Based Analysis System of Complex Table Form Document

Abstract

Structure analysis of table form documents is important because printed documents, and also electronic documents, explicitly provide only geometric layout and lexical information. To handle these documents automatically, logical structure information is necessary. In this paper, we first propose a general XML-based representation of table form documents that contains both structure and layout information. Next, we present a structure analysis system based on a graph grammar that represents document structure knowledge. Because the relations between adjacent fields in table form documents are two-dimensional, a two-dimensional notation is necessary to denote structural knowledge; we therefore adopt a two-dimensional graph grammar. Because the rules are relatively simple, the grammar notation makes this knowledge easy to modify and keep consistent. Another advantage of the grammar notation is that it can be used to generate documents from logical structure alone. Experimental results show that the system successfully analyzed several kinds of table forms.
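As a rough illustration of the two ideas in the abstract (a two-dimensional adjacency rule over fields, and an XML output that carries both logical structure and layout), here is a minimal Python sketch. It is not the paper's grammar or XML schema: the element names `form`, `item`, `label`, and `value`, the box attributes `x`, `y`, `w`, `h`, the single toy rule "an item is a label with a value box touching its right edge", and the sample fields are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical field boxes as a layout analyzer might emit them
# (sample data invented for illustration only).
FIELDS = [
    {"text": "Name", "x": 0, "y": 0, "w": 40, "h": 10},
    {"text": "Alice", "x": 40, "y": 0, "w": 60, "h": 10},
    {"text": "City", "x": 0, "y": 10, "w": 40, "h": 10},
    {"text": "Seattle", "x": 40, "y": 10, "w": 60, "h": 10},
]


def is_right_of(a, b):
    """True if box b touches the right edge of box a and covers the same row."""
    return a["x"] + a["w"] == b["x"] and a["y"] == b["y"] and a["h"] == b["h"]


def pair_labels_with_values(fields):
    """Apply one toy rule: Item -> Label Value, with Value right-adjacent to Label."""
    pairs = []
    used = set()
    for i, a in enumerate(fields):
        if i in used:
            continue
        for j, b in enumerate(fields):
            if j != i and j not in used and is_right_of(a, b):
                pairs.append((a, b))
                used.update((i, j))
                break
    return pairs


def to_xml(pairs):
    """Emit logical structure (item/label/value) with layout kept as attributes."""
    root = ET.Element("form")
    for label, value in pairs:
        item = ET.SubElement(root, "item")
        for role, f in (("label", label), ("value", value)):
            el = ET.SubElement(
                item, role,
                x=str(f["x"]), y=str(f["y"]), w=str(f["w"]), h=str(f["h"]),
            )
            el.text = f["text"]
    return root


root = to_xml(pair_labels_with_values(FIELDS))
print(ET.tostring(root, encoding="unicode"))
```

Keeping the box coordinates as attributes on the logical elements means a single tree can serve both analysis and regeneration of the layout. A real two-dimensional grammar would also need vertical adjacency and nested productions, which this sketch omits.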