Answering Relationship Queries on the Web

Abstract

Finding relationships between entities on the Web, e.g., the connections between different places or the commonalities among people, is a novel and challenging problem. Existing Web search engines excel at keyword matching and document ranking, but they handle many relationship queries poorly. This paper proposes a new method for answering relationship queries on two entities. Our method first retrieves the top Web pages for each entity separately from a Web search engine. It then matches these Web pages and generates an ordered list of Web page pairs, each consisting of one Web page for each entity. The top-ranked Web page pairs are likely to contain the relationships between the two entities. One main challenge in the ranking process is to filter out the large amount of noise in the Web pages without losing much useful information. To achieve this, our method assigns weights to the terms in the Web pages and identifies the potential connecting terms that capture the relationships between the two entities. Only the top potential connecting terms, those with large weights, are used to rank Web page pairs. Finally, the top-ranked Web page pairs are presented to the searcher. For each such pair, the query terms and the top potential connecting terms are highlighted so that the relationships between the two entities can be easily identified. We implemented a prototype on top of the Google search engine and evaluated it under a wide variety of query scenarios. The experimental results show that our method is effective at finding important relationships with low overhead.
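
The pipeline described in the abstract (weight the terms, pick the top potential connecting terms, rank page pairs using only those terms) can be illustrated with a minimal sketch. The abstract does not give the actual weighting or scoring formulas, so everything below is an assumption made for illustration: the TF-IDF-style weights, the product-based pair score, and the names `weigh_terms` and `rank_page_pairs` are hypothetical and are not the authors' implementation. The pages are passed in as plain strings; retrieval from a search engine is left out.

```python
import math
import re
from collections import Counter
from itertools import product


def tokenize(text):
    # Lowercase alphanumeric tokens; no stop-word removal in this sketch.
    return re.findall(r"[a-z0-9]+", text.lower())


def weigh_terms(pages, all_pages):
    # ASSUMPTION: TF-IDF-style weights; the abstract does not name a scheme.
    n = len(all_pages)
    df = Counter(t for p in all_pages for t in set(tokenize(p)))
    weighted = []
    for page in pages:
        tf = Counter(tokenize(page))
        weighted.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return weighted


def rank_page_pairs(pages_a, pages_b, query_terms, top_k=5):
    all_pages = pages_a + pages_b
    wa = weigh_terms(pages_a, all_pages)
    wb = weigh_terms(pages_b, all_pages)
    skip = {t for q in query_terms for t in tokenize(q)}

    # Best weight of each term on either side.
    best_a, best_b = Counter(), Counter()
    for w in wa:
        for t, v in w.items():
            best_a[t] = max(best_a[t], v)
    for w in wb:
        for t, v in w.items():
            best_b[t] = max(best_b[t], v)

    # Potential connecting terms: non-query terms seen for both entities.
    # ASSUMPTION: strength is the product of the best weights on each side.
    strength = {
        t: best_a[t] * best_b[t]
        for t in best_a.keys() & best_b.keys()
        if t not in skip
    }
    connecting = sorted(strength, key=lambda t: (-strength[t], t))[:top_k]

    # Score every (page of A, page of B) pair using only the connecting terms.
    pairs = []
    for (i, a), (j, b) in product(enumerate(wa), enumerate(wb)):
        score = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in connecting)
        pairs.append((score, i, j))
    pairs.sort(key=lambda p: (-p[0], p[1], p[2]))
    return connecting, pairs


if __name__ == "__main__":
    # Toy pages standing in for search results (made-up text).
    pages_a = ["Alice studied physics at Stanford university",
               "Alice likes hiking and coffee"]
    pages_b = ["Bob graduated from Stanford with a physics degree",
               "Bob plays guitar"]
    terms, ranked = rank_page_pairs(pages_a, pages_b, ["Alice", "Bob"])
    print(terms)
    print(ranked[0])
```

Restricting the pair score to the top connecting terms mirrors the noise-filtering idea in the abstract: terms that do not plausibly link the two entities contribute nothing to the ranking. A real system would also need stop-word filtering, which this sketch omits.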
