
A Web-Based Novel Term Similarity Framework for Ontology Learning



Abstract

Because pairwise similarity computations are essential in ontology learning and data mining, we propose a similarity framework built on a conventional Web search engine. Using a Web search engine offers two main benefits. First, we can obtain the freshest content for each term, which reflects up-to-date knowledge about that term. This is particularly useful for dynamic ontology management, since ontologies must evolve over time as new concepts or terms appear. Second, compared with approaches that use a fixed amount of crawled Web documents as a corpus, our method is less sensitive to data sparseness because a search engine gives access to as much content as possible. At the core of the proposed methodology are two measures for similarity computation: a mutual-information-based metric and a feature-based metric. Moreover, we show how the proposed metrics can be used to modify existing ontologies. Finally, we compare the extracted similarity relations with semantic similarity computed using WordNet. Experimental results show that our method can extract topical relations between terms that are not present in conventional concept-based ontologies.
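The abstract names the two measures but does not give their formulas. As a rough illustration only, the sketch below assumes that the mutual-information-based measure can be approximated by pointwise mutual information over search-engine page counts, and that the feature-based measure can be approximated by cosine similarity between bag-of-words vectors built from result snippets. The function names, the `total_pages` index-size estimate, and both formulas are assumptions for illustration, not the authors' definitions.

```python
import math
from collections import Counter


def pmi_similarity(hits_a, hits_b, hits_ab, total_pages):
    """Pointwise mutual information from page counts (assumed formulation).

    hits_a, hits_b: page counts for each term queried alone.
    hits_ab: page count for the query containing both terms.
    total_pages: assumed estimate of the number of indexed pages.
    Returns 0.0 when any count is zero, since PMI is undefined there.
    """
    if min(hits_a, hits_b, hits_ab) <= 0 or total_pages <= 0:
        return 0.0
    p_a = hits_a / total_pages
    p_b = hits_b / total_pages
    p_ab = hits_ab / total_pages
    return math.log2(p_ab / (p_a * p_b))


def feature_similarity(snippets_a, snippets_b):
    """Cosine similarity of word-count vectors built from snippets (assumed formulation)."""
    vec_a = Counter(w for s in snippets_a for w in s.lower().split())
    vec_b = Counter(w for s in snippets_b for w in s.lower().split())
    dot = sum(vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In practice the counts and snippets would come from whatever search API is available; they are passed in as plain arguments here so that the sketch does not depend on any particular service.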
