A hybrid approach for ontology-based information extraction


Abstract

Information extraction (IE) is the process of automatically transforming written natural language (i.e., text) into structured information, such as a knowledge base. However, because natural language is inherently ambiguous, this transformation process is highly complex. Moreover, as IE moves from the analysis of scientific documents to the analysis of Internet textual content, we cannot rely completely on the assumption that the content of the text is correct. Indeed, in contrast to scientific documents, which are peer reviewed, Internet content is not verified for quality and correctness.

Thus, two main issues that affect the IE process are the complexity of the extraction process and the quality of the data.

In this dissertation, we propose an improved ontology-based IE (OBIE) by providing solutions to these issues of accuracy and content quality. Based on a hybrid strategy that combines aspects of IE that are usually considered opposed to each other, or that are not considered at all, we intend to improve IE by developing a more accurate extraction and new functionality (semantic error detection). Our approach is based on OBIE, a sub-area of IE, which reduces extraction complexity by including domain knowledge, in the form of concepts and relationships of the domain, to guide the extraction process.

We address the complexity of extraction by combining information extractors that have different implementations. By integrating different types of implementation into one extraction system, we can produce a more accurate extraction. For each concept or relationship in the ontology, we can select the best implementation for extraction, or we can combine both implementations under an ensemble learning scheme. In tandem, we address the quality of information by determining its semantic correctness with regard to domain knowledge. We define two methods for semantic error detection: predefining the types of errors expected in the text, or applying logical reasoning to the text.

This dissertation includes both published and unpublished coauthored material.
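The idea of choosing, per ontology concept, between extractor implementations or combining them can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two toy extractors, the "capital of" concept, the validation sentences, and the conservative "all must agree" combination are hypothetical and are not taken from the dissertation, which does not specify its extractors or its ensemble method in this abstract.

```python
from typing import Callable, List, Tuple

# An extractor decides whether a sentence expresses a given ontology concept.
Extractor = Callable[[str], bool]
Labeled = List[Tuple[str, bool]]


def rule_extractor(sentence: str) -> bool:
    """Hypothetical rule-based extractor: fires on a fixed lexical pattern."""
    return "is the capital of" in sentence.lower()


def keyword_extractor(sentence: str) -> bool:
    """Hypothetical stand-in for a learned extractor: fires on a cue word."""
    return "capital" in sentence.lower()


def accuracy(extractor: Extractor, labeled: Labeled) -> float:
    """Fraction of labeled sentences on which the extractor is right."""
    correct = sum(extractor(text) == label for text, label in labeled)
    return correct / len(labeled)


def select_best(extractors: List[Extractor], labeled: Labeled) -> Extractor:
    """Pick the implementation with the highest validation accuracy."""
    return max(extractors, key=lambda e: accuracy(e, labeled))


def combine_all(extractors: List[Extractor]) -> Extractor:
    """Assumed combination rule: fire only when every extractor fires."""
    return lambda sentence: all(e(sentence) for e in extractors)


# Hypothetical validation data for one concept ("capital of").
validation: Labeled = [
    ("Salem is the capital of Oregon.", True),
    ("Venture capital funded the startup.", False),
    ("Eugene is a city in Oregon.", False),
]

candidates = [rule_extractor, keyword_extractor]
best = select_best(candidates, validation)
combined = combine_all(candidates)
```

In a full system this selection would be repeated for every concept and relationship in the ontology, so that different parts of the ontology may end up served by different implementations.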

Bibliographic record

Author: Gutierrez, Fernando
Author affiliation: University of Oregon
Degree-granting institution: University of Oregon
Subject: Computer science; Artificial intelligence
Degree: Ph.D.
Year: 2015
Pages: 132 p.
Total pages: 132
Original format: PDF
Language of text: English

Similar literature

1. A hybrid ontology-based information extraction system [J]. Fernando Gutierrez, Dejing Dou, Stephen Fickas. Journal of Information Science. 2016, No. 6

2. An Ontology-Based Semantic Extraction Approach from Text Corpus [J]. Kittiphong Sengloiluean, Ngamnij Arch-int, Somjit Arch-int. Journal of Theoretical and Applied Information Technology. 2020, No. 14

3. Ontology-based approach to enhance medical web information extraction [J]. Nassim Abdeldjallal Otmani, Malik Si-Mohammed, Catherine Comparot. International Journal of Web Information Systems. 2019, No. 3

4. An Ontology-Based Information Extraction System for bridging the configuration gap in hybrid SDN environments [C]. Martinez A., Yannuzzi M., Lopez de Vergara J.E. IFIP/IEEE International Symposium on Integrated Network Management. 2015

5. Applying persona concept and ontology-based approach to support the requirements engineering process [D]. Sim, Wee Wee. 2015

6. Lead extraction and upgrade to a biventricular device with concomitant systemic tricuspid valve replacement in an adult with congenitally corrected transposition: A hybrid approach [O]. Tahmeed Contractor, Ahmed Kheiwa, Ravi Mandapati. 2020

7. A Hybrid Approach for Ontology-based Information Extraction [O]. Gutierrez, Fernando. 2016

8. Automated Extraction and Characterisation of Social Network Data from Unstructured Sources: An Ontology-Based Approach [R]. Martineau, E., Lecocq, R. 2013
