Conference: International Conference of the Italian Association for Artificial Intelligence

Document Layout Analysis for Semantic Information Extraction


Abstract

Using machines to automatically extract relevant information from unstructured and semi-structured sources has practical significance in today's life and business. In this context, although understanding the meaning of words is important, identifying self-consistent geometric and logical regions of interest (blocks, cells, columns and tables, as well as paragraphs, titles and captions, to mention only a few) is of paramount importance too. This complex process goes under the name of document layout analysis. In this work, we discuss newly designed techniques to solve this problem effectively by combining both syntactic and semantic document aspects. The techniques described here are at the basis of KnowRex, a comprehensive system for ontology-driven Information Extraction.
