Query expansion is widely used in information retrieval systems. To improve patent retrieval results, this paper applies query expansion: word embeddings are trained on relevant patent documents, and candidate words with high similarity to the original query are selected as expansion terms and added to the original query. Four methods are proposed for obtaining expansion terms from word embeddings, and two methods are proposed for ranking expansion terms by relevance to the query, improving on existing expansion-term selection methods. Experiments on the TREC dataset show that combining embedding-based expansion-term selection with the traditional TF-IDF selection method effectively improves the performance of query expansion models and helps capture the user's query intent.
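The abstract does not spell out the four selection methods, so the following is only a minimal sketch of the general idea (pick candidate words whose embeddings are most similar to the original query), assuming a centroid-of-query-terms representation and cosine similarity. The function names and the toy three-dimensional vectors are hypothetical and stand in for embeddings trained on patent text.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def expand_query(query_terms, embeddings, k=2):
    """Append the k vocabulary words most similar to the query centroid.

    Assumption: the query is represented by the mean of its term vectors.
    This is one plausible choice, not necessarily one of the paper's methods.
    """
    known = [t for t in query_terms if t in embeddings]
    if not known:
        return list(query_terms)  # nothing to compare against
    q_vec = centroid([embeddings[t] for t in known])
    candidates = [
        (cosine(q_vec, vec), term)
        for term, vec in embeddings.items()
        if term not in query_terms
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return list(query_terms) + [term for _, term in candidates[:k]]


# Hypothetical toy embeddings; real ones would be trained on patent documents.
TOY_EMBEDDINGS = {
    "battery": [0.9, 0.1, 0.0],
    "electrode": [0.8, 0.2, 0.1],
    "lithium": [0.85, 0.15, 0.05],
    "anode": [0.8, 0.25, 0.05],
    "bicycle": [0.0, 0.1, 0.9],
}

expanded = expand_query(["battery", "electrode"], TOY_EMBEDDINGS, k=2)
print(expanded)
```

In a combined setup of the kind the abstract reports, the terms selected this way would be merged with terms chosen by TF-IDF weighting; how the two lists are merged and ranked is not stated in the abstract and is left out of this sketch.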