Published in: Expert Systems with Applications

Query expansion based on term distribution and DBpedia features

Abstract

Query Expansion (QE) approaches, which reformulate queries by adding new terms to the initial user query, are intended to ameliorate the vocabulary mismatch between the query keywords and the documents in Information Retrieval Systems (IRS). One major issue in QE is the selection of the right candidate terms for expansion. For this purpose, Linked Data can be used as a valuable resource for providing additional expansion features, such as the values of sub- and superclasses of resources. The underlying research question is whether interlinked data and vocabulary items provide features that can be taken into account for query expansion. In this paper, we introduce a new QE approach that aims to improve IRS by using the well-known distribution-based method Bose-Einstein statistics (Bo1) as well as Linked Data from the knowledge base DBpedia, with different numbers of expansion terms. We evaluated the effectiveness of each method individually as well as their combinations using two Text REtrieval Conference (TREC) test collections. Our approach has led to significant improvements in terms of precision, recall, Mean Average Precision (MAP) at rank 10, and normalized Discounted Cumulative Gain (nDCG) at different ranks compared to Pseudo Relevance Feedback (PRF), which we used as a baseline. The results show that the inclusion of semantic annotations clearly improves retrieval performance over the baseline method.
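
The abstract names Bo1 but does not give its formula. As a rough illustration only, the sketch below uses the form of Bo1 commonly given in the Divergence From Randomness literature, w(t) = tf_x * log2((1 + P_n) / P_n) + log2(1 + P_n), where tf_x is the term's frequency in the top-ranked feedback documents and P_n = F / N (collection frequency over number of documents). The function names, the toy data, and the tie-breaking rule are assumptions made for this example and are not taken from the paper; the DBpedia features are not modeled here.

```python
import math
from collections import Counter


def bo1_weights(feedback_docs, collection_tf, num_docs):
    """Score candidate expansion terms with the Bo1 (Bose-Einstein) formula.

    feedback_docs: list of token lists (the top-ranked feedback documents)
    collection_tf: dict mapping term -> total frequency in the collection
    num_docs:      number of documents in the collection
    """
    # tf_x: frequency of each term across the feedback documents
    tf_x = Counter(tok for doc in feedback_docs for tok in doc)
    weights = {}
    for term, tf in tf_x.items():
        p_n = collection_tf.get(term, 0) / num_docs
        if p_n <= 0:
            continue  # no collection statistics: skip (an assumption of this sketch)
        weights[term] = tf * math.log2((1 + p_n) / p_n) + math.log2(1 + p_n)
    return weights


def expand_query(query_terms, weights, k):
    """Append the k highest-weighted terms not already in the query."""
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(weights, key=lambda t: (-weights[t], t))
    extra = [t for t in ranked if t not in query_terms][:k]
    return list(query_terms) + extra


if __name__ == "__main__":
    # Hypothetical toy statistics, for illustration only.
    docs = [["query", "expansion", "dbpedia"], ["dbpedia", "linked", "data"]]
    ctf = {"query": 500, "expansion": 50, "dbpedia": 5, "linked": 100, "data": 1000}
    w = bo1_weights(docs, ctf, num_docs=1000)
    print(expand_query(["query"], w, k=2))
```

Terms that are frequent in the feedback documents but rare in the collection receive the highest weights, which is why the rare toy term "dbpedia" ranks first here. The parameter `k` plays the role of the "number of expansion terms" that the abstract says was varied.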