Conference: International Conference on Database and Expert Systems Applications

Research Paper Search Using a Topic-Based Boolean Query Search and a General Query-Based Ranking Model

Abstract

A search for research papers should return comprehensive results related to the user's query. In general, a user inputs a Boolean query that reflects the information need, and the search engine ranks the research papers based on the query. However, it is difficult to anticipate all possible terms that authors of relevant papers might have used. Moreover, general query-based ranking methods emphasize how to rank the relevant documents at the top of the results, but they still need some means of guaranteeing the comprehensiveness of the results. Therefore, two ranking methods that consider the comprehensiveness of relevant papers are proposed. The first uses a topic-based Boolean query search. This search converts every word in the abstract set and in the query into a topic via topic analysis by Latent Dirichlet Allocation (LDA) and conducts the search at the topic level. The topic assigned to synonyms of a search term is expected to be the same as that assigned to the search term itself. Each paper is ranked by the number of times it is matched across the topic-based Boolean query searches executed for various LDA parameter settings. The second is a hybrid method that emphasizes the better results from our topic-based ranking result and a general query-based ranking result. This method is based on the observation that the paper sets retrieved by our method and by a general ranking method will be different. Through experiments using the NTCIR-1 and NTCIR-2 datasets, the effectiveness of our topic-based and hybrid methods is demonstrated.
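The first method can be illustrated with a minimal sketch. The abstract does not give the query syntax, how words are assigned to topics, or how ties are broken, so everything below is an assumption: the query is written as an AND of OR-groups, each LDA parameter setting is represented by a hypothetical precomputed word-to-topic dictionary (the LDA training itself is not shown), and ties are broken by paper id. The names `to_topics`, `matches`, and `rank_by_votes` are illustrative and do not come from the paper.

```python
from collections import Counter


def to_topics(words, word_topic):
    """Map words to topic ids, skipping words the topic model has not seen."""
    return {word_topic[w] for w in words if w in word_topic}


def matches(doc_topics, query_cnf, word_topic):
    """Evaluate an AND-of-ORs Boolean query at the topic level.

    Every OR-group must share at least one topic with the document.
    """
    for or_group in query_cnf:
        if not (to_topics(or_group, word_topic) & doc_topics):
            return False
    return True


def rank_by_votes(abstracts, query_cnf, topic_models):
    """Rank papers by how many topic models (parameter settings) match them."""
    votes = Counter({paper_id: 0 for paper_id in abstracts})
    for word_topic in topic_models:
        for paper_id, words in abstracts.items():
            if matches(to_topics(words, word_topic), query_cnf, word_topic):
                votes[paper_id] += 1
    # Most votes first; ties broken by paper id (an assumption of this sketch).
    return sorted(votes, key=lambda p: (-votes[p], p))


# Toy data: tokenized abstracts and the query "retrieval AND ranking".
abstracts = {
    "p1": ["retrieval", "ranking"],
    "p2": ["search", "ordering"],
    "p3": ["protein", "folding"],
}
query_cnf = [["retrieval"], ["ranking"]]

# Two hypothetical word-to-topic assignments, standing in for two LDA settings.
topic_models = [
    {"retrieval": 0, "search": 0, "ranking": 1, "ordering": 1,
     "protein": 2, "folding": 2},
    {"retrieval": 0, "search": 0, "ranking": 1, "ordering": 2,
     "protein": 3, "folding": 3},
]

ranking = rank_by_votes(abstracts, query_cnf, topic_models)
print(ranking)
```

In this toy data, paper `p2` shares no word with the query but is still matched under the first setting, because its words fall into the same topics as the query terms. That is the synonym effect the abstract describes.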