Conference: IEEE International Conference on Information Reuse and Integration

Enhancing Efficiency of Web Search Engines through Ontology Learning from Unstructured Information Sources


Abstract

As the volume of information available on the World Wide Web grows rapidly, the ranking methods of search engines are increasingly unable to cope with it. Web search engines should be enriched with methodologies that enable them to understand the content of Web pages and then align each page with the query category that best matches its content. This paper introduces a system that deals with the abundance of information by automatically understanding the content of a Web page and semantically modeling the ontological concepts it contains. Each semantic relation between ontological concepts is automatically given a score, or weight, based on its influence on the given query. The weighted semantic relations between ontological concepts can be viewed as a signature for the query: the more similar an article is to this signature, the more relevant it is to the query. A new relevancy measure is introduced to semantically re-rank or classify Web pages; it computes semantic similarity as the weighted intersection ratio between the ontological concepts extracted from the retrieved Web pages and the ontological concepts that represent the query. Results show that the proposed system achieves the highest Pearson correlation coefficient with human judgments (0.890), outperforming state-of-the-art semantic similarity methods and Web-based methods. The proposed model was also tested on re-ranking Web pages according to their semantic relevancy to the query; experiments show that its ranking converges most closely to the expert ranking order of Web pages compared with other Web search engines.
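The abstract does not give the formula for the relevancy measure, so the following is only a minimal sketch of one plausible reading of a "weighted intersection ratio". It assumes the query signature is a mapping from concept-pair relations to weights, that each page is reduced to a set of such relations, and that the score is the share of signature weight also found on the page. All names (`relation`, `weighted_intersection_ratio`, `rerank`), the example concepts, and the weights are hypothetical and not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumption: a semantic relation is modeled as an unordered pair of concept labels.
Relation = FrozenSet[str]


def relation(a: str, b: str) -> Relation:
    """Build a case-insensitive, unordered relation between two concepts."""
    return frozenset((a.lower(), b.lower()))


def weighted_intersection_ratio(
    signature: Dict[Relation, float], page_relations: Set[Relation]
) -> float:
    """Fraction of the signature's total weight that also occurs on the page.

    This is an assumed form of the measure, not the paper's definition.
    """
    total = sum(signature.values())
    if total == 0:
        return 0.0
    shared = sum(w for rel, w in signature.items() if rel in page_relations)
    return shared / total


def rerank(
    signature: Dict[Relation, float], pages: Dict[str, Set[Relation]]
) -> List[Tuple[str, float]]:
    """Sort pages by descending score against the query signature."""
    scored = [
        (name, weighted_intersection_ratio(signature, rels))
        for name, rels in pages.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical query signature: weights express each relation's influence on the query.
signature = {
    relation("jaguar", "cat"): 2.0,
    relation("jaguar", "habitat"): 1.0,
    relation("cat", "predator"): 1.0,
}

# Hypothetical pages, each already reduced to its extracted relations.
pages = {
    "page_c": {relation("jaguar", "car")},
    "page_b": {relation("jaguar", "habitat")},
    "page_a": {relation("jaguar", "cat"), relation("cat", "predator")},
}

ranked = rerank(signature, pages)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Normalizing by the signature's total weight keeps scores in the range 0 to 1, so pages retrieved for the same query can be compared directly. Extracting the concepts and assigning the weights, which the abstract describes as automatic, is outside the scope of this sketch.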
