
Mixed graph of terms for query expansion



Abstract

It is well known that one way to improve the accuracy of a text retrieval system is to expand the original query with additional knowledge encoded as topic-related terms. In an interactive environment, the expansion, which is usually represented as a list of words, is extracted from documents whose relevance is known from the user's feedback. In this paper we argue that the accuracy of a text retrieval system can be improved by using a query expansion method based on a mixed Graph of Terms representation instead of a method based on a simple list of words. The graph, which is composed of a directed and an undirected subgraph, can be extracted automatically from a small set of relevant documents only (namely the user feedback), using a term extraction method based on the probabilistic Topic Model. The proposed method has been evaluated by comparing it with two less complex structures: one represented as a set of pairs of words and another that is a simple list of words.
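To make the idea of a graph-based expansion concrete, here is a minimal Python sketch of a mixed graph of terms and a one-step expansion over it. It is not the authors' method: the abstract only states that the graph has a directed and an undirected subgraph, so the edge weights, the `MixedTermGraph` class, the `expand_query` function, and the `min_weight` threshold are all hypothetical names and choices made for illustration. The graph is built by hand here; the Topic Model extraction step is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Tuple


@dataclass
class MixedTermGraph:
    """Toy mixed graph of terms (illustrative assumption, not the paper's structure)."""

    # Undirected edges, keyed by the unordered pair of terms they join.
    undirected: Dict[FrozenSet[str], float] = field(default_factory=dict)
    # Directed edges, keyed by (source term, target term).
    directed: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def add_undirected(self, a: str, b: str, weight: float) -> None:
        self.undirected[frozenset((a, b))] = weight

    def add_directed(self, source: str, target: str, weight: float) -> None:
        self.directed[(source, target)] = weight

    def neighbours(self, term: str) -> Dict[str, float]:
        """Return {neighbour: weight} for terms one step away from `term`."""
        result: Dict[str, float] = {}
        for pair, w in self.undirected.items():
            # Skip degenerate self-loops, which collapse to a one-element set.
            if term in pair and len(pair) == 2:
                (other,) = pair - {term}
                result[other] = max(w, result.get(other, 0.0))
        for (src, dst), w in self.directed.items():
            # Directed edges are followed only from source to target.
            if src == term:
                result[dst] = max(w, result.get(dst, 0.0))
        return result


def expand_query(
    query_terms: List[str], graph: MixedTermGraph, min_weight: float = 0.0
) -> Dict[str, float]:
    """Return {term: weight}: original terms at 1.0 plus one-step neighbours."""
    expanded = {t: 1.0 for t in query_terms}
    for t in query_terms:
        for other, w in graph.neighbours(t).items():
            if w >= min_weight and other not in query_terms:
                expanded[other] = max(w, expanded.get(other, 0.0))
    return expanded


# Hand-built example graph with made-up weights.
g = MixedTermGraph()
g.add_undirected("retrieval", "query", 0.8)
g.add_directed("query", "expansion", 0.6)
g.add_directed("expansion", "query", 0.3)

print(expand_query(["query"], g, min_weight=0.5))
```

Compared with a flat word list, the graph lets the expansion depend on which query terms are present and on the direction of each relation: in the sketch, "expansion" is added when the query contains "query", but the weaker reverse edge would be filtered out by the same threshold.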
