Document Layout Analysis for Semantic Information Extraction

Abstract

Using machines to automatically extract relevant information from unstructured and semi-structured sources has practical significance in today's life and business. In this context, although understanding the meaning of words is important, identifying self-consistent geometric and logical regions of interest (blocks, cells, columns, and tables, as well as paragraphs, titles, and captions, to mention only a few) is of paramount importance too. This complex process is known as document layout analysis. In this work, we discuss newly designed techniques that solve this problem effectively by combining the syntactic and semantic aspects of documents. The techniques described here form the basis of KnowRex, a comprehensive system for ontology-driven Information Extraction.

Bibliographic details

Source
International Conference of the Italian Association for Artificial Intelligence, 2017, pp. 269-281 (13 pages)
Authors
Weronika T. Adrian; Nicola Leone; Marco Manna; Cinzia Marte
Original format
PDF
Keywords
Document Layout Analysis; Information Extraction; Table recognition; Answer Set Programming; Ontologies; Knowledge representation

Date indexed: 2022-08-26 13:48:50

Similar literature

1. Using latent semantic analysis for automated keyword extraction from large document corpora [J]. Tuğba Önal Süzek. Turkish Journal of Electrical Engineering and Computer Sciences, 2017, No. 3
2. Comparison of Latent Semantic Analysis and Probabilistic Latent Semantic Analysis for Documents Clustering [J]. Marcin Kuta, Kitowski. Computing and Informatics, 2015, No. 3
3. Comparison of Latent Semantic Analysis and Probabilistic Latent Semantic Analysis for Documents Clustering [J]. Marcin Kuta, Jacek Kitowski. Computing and Informatics, 2014, No. 3
4. Document Layout Analysis for Semantic Information Extraction [C]. Weronika T. Adrian, Nicola Leone, Marco Manna. International Conference of the Italian Association for Artificial Intelligence, 2017
5. Extraction of semantic header from RTF documents [D]. Ali, Abdelbaset. 1999
6. A System for Automated Extraction of Metadata from Scanned Documents using Layout Recognition and String Pattern Search Models [O]. Dharitri Misra, Siyuan Chen, George R. Thoma
7. Machine Learning for Digital Document Processing: From Layout Analysis To Metadata Extraction [O]. Floriana Esposito, Stefano Ferilli, Teresa M. A. Basile. 2010
