Conference: International Conference on Information Systems and Computer Aided Education

Research and Exploration on the Construction Method of Knowledge Graph of Water Field Based on Text

Abstract

With the development of the Internet age, collected data have become an important source of knowledge. Unstructured text contains many named entities but very little detailed information about them. The Baidu encyclopedia website, by contrast, is a semistructured data source that in many cases includes a detailed introduction to each entity. By combining the advantages of these two kinds of data, we can enrich the knowledge base of a knowledge graph. Starting from raw text, this paper aims to extract semistructured data about the named entities the text contains. On the one hand, it extracts named entities with the help of the Harbin Institute of Technology model, parses semistructured content about those entities using the Octopus tool, constructs a local ontology, and merges the ontology using Python's built-in difflib.SequenceMatcher tool and the Deckard similarity algorithm. On the other hand, it creates an XPath-based wrapper to extract the attributes and attribute values of named entities from the semistructured data. The experimental results show that this approach can automatically extract information related to named entities from the Baidu encyclopedia to supplement the knowledge base of a water-domain knowledge graph. The approach can also serve as a reference for constructing knowledge graphs in other domains.
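The ontology-merging step named in the abstract can be illustrated with a minimal sketch. It assumes that the "Deckard similarity algorithm" refers to Jaccard similarity, and that two concept labels are merged when the mean of the two scores reaches a threshold. The function names, the character-level sets, the averaging rule, and the 0.8 threshold are all hypothetical choices made for illustration; only `difflib.SequenceMatcher` comes from the abstract, and the paper's actual merging rule may differ.

```python
from difflib import SequenceMatcher


def sequence_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1] computed by difflib from matching blocks."""
    return SequenceMatcher(None, a, b).ratio()


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over the sets of characters in each label.

    Assumption: character-level sets; the paper may use tokens instead.
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty labels are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def should_merge(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical rule: merge when the mean of both scores reaches threshold."""
    return (sequence_ratio(a, b) + jaccard(a, b)) / 2 >= threshold
```

Under these assumptions, `should_merge("reservoir", "reservoirs")` is true, while `should_merge("reservoir", "dam")` is false because the two labels share no characters.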
