Journal: Information Retrieval

Entity ranking in Wikipedia: utilising categories, links and topic difficulty prediction

Abstract

Entity ranking has recently emerged as a research field that aims at retrieving entities as answers to a query. Unlike entity extraction, where the goal is to tag names of entities in documents, entity ranking focuses on returning a ranked list of relevant entity names for the query. Many approaches to entity ranking have been proposed, and most of them have been evaluated on the INEX Wikipedia test collection. In this paper, we describe a system we developed for ranking Wikipedia entities in answer to a query. The entity ranking approach implemented in our system utilises the known categories, the link structure of Wikipedia, and the link co-occurrences with the entity examples (when provided) to retrieve relevant entities as answers to the query. We also extend our entity ranking approach by utilising the knowledge of predicted classes of topic difficulty. To predict the topic difficulty, we generate a classifier that uses features extracted from an INEX topic definition to classify the topic into an experimentally predetermined class. This knowledge is then used to dynamically set the optimal values for the retrieval parameters of our entity ranking system. Our experiments demonstrate that the use of categories and the link structure of Wikipedia can significantly improve entity ranking effectiveness, and that topic difficulty prediction is a promising approach that could be exploited to further improve entity ranking performance.
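
The abstract does not say how the category, link and text evidence are combined, or which retrieval parameters depend on the predicted difficulty class. As an illustration only, the sketch below assumes a weighted linear combination of three per-entity scores, with the weights looked up from the predicted class. The class names ("easy", "hard"), the weight values, the Jaccard-based category score and all function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical retrieval parameters: one weight per evidence source."""
    category: float
    link: float
    text: float


# Assumption: illustrative values only, one weight set per difficulty class.
WEIGHTS_BY_DIFFICULTY = {
    "easy": Weights(category=0.2, link=0.1, text=0.7),
    "hard": Weights(category=0.5, link=0.3, text=0.2),
}


def category_score(entity_categories, target_categories):
    """Jaccard overlap between an entity's categories and the topic's
    target categories (an assumed stand-in for a category similarity)."""
    if not entity_categories or not target_categories:
        return 0.0
    shared = entity_categories & target_categories
    combined = entity_categories | target_categories
    return len(shared) / len(combined)


def rank_entities(candidates, target_categories, difficulty):
    """Rank candidates by a weighted sum of category, link and text scores.

    candidates maps an entity name to a tuple
    (categories, link_score, text_score); the link and text scores are
    assumed to be precomputed and normalised to [0, 1].
    """
    w = WEIGHTS_BY_DIFFICULTY[difficulty]
    scored = []
    for name, (categories, link_score, text_score) in candidates.items():
        score = (
            w.category * category_score(categories, target_categories)
            + w.link * link_score
            + w.text * text_score
        )
        scored.append((name, score))
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch the predicted difficulty class only selects a weight set; the classifier that would produce that class from the topic definition is left out because the abstract gives no detail on its features.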
