Journal: Information Retrieval

Mining and ranking users' intents behind queries



Abstract

Understanding the intents behind user queries is crucial to improving the performance of Web search systems, and the NTCIR-11 IMine task focuses on this problem. In this paper, we address the NTCIR-11 IMine task in two phases, referred to as Query Intent Mining (QIM) and Query Intent Ranking (QIR). (I) QIM mines users' potential intents by clustering short text fragments related to the given query. (II) QIR ranks those mined intents in a proper way. Two challenges arise in handling these tasks: (1) how to precisely estimate the intent similarity between user queries that consist of only a few words; and (2) how to properly rank intents in terms of multiple factors, e.g., relevance, diversity, and intent drift. For the first challenge, we first investigate two interesting phenomena by analyzing query logs and document datasets, namely "Same-Intent-Co-Click" (SICC) and "Same-Intent-Similar-Rank" (SISR). SICC means that different queries represent the same intent if the users who issue them click on the same URL. SISR means that if two queries denote the same intent, issuing them to a search engine should return similar search results. We then propose similarity functions for QIM based on these two phenomena. For the second challenge, we propose a novel intent ranking model that considers multiple factors as a whole. We perform extensive experiments and an interesting case study on the Chinese dataset of the NTCIR-11 IMine task. Experimental results demonstrate the effectiveness of our proposed approaches for both QIM and QIR.
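The abstract does not give the actual similarity functions, so the following is only a minimal sketch of how the SICC and SISR ideas might be turned into query-to-query similarity scores. It assumes Jaccard overlap as the measure in both cases, and the names `sicc_similarity`, `sisr_similarity`, `combined_similarity`, the cutoff `k`, and the weight `alpha` are all hypothetical, not taken from the paper.

```python
from typing import List, Set


def _jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def sicc_similarity(clicks_a: Set[str], clicks_b: Set[str]) -> float:
    """SICC-style score (assumed form): overlap of the URL sets clicked
    for two queries in a query log."""
    return _jaccard(clicks_a, clicks_b)


def sisr_similarity(results_a: List[str], results_b: List[str], k: int = 10) -> float:
    """SISR-style score (assumed form): overlap of the top-k search results
    returned for two queries. This simple version ignores rank order
    within the top k."""
    return _jaccard(set(results_a[:k]), set(results_b[:k]))


def combined_similarity(
    clicks_a: Set[str],
    clicks_b: Set[str],
    results_a: List[str],
    results_b: List[str],
    alpha: float = 0.5,
    k: int = 10,
) -> float:
    """Hypothetical linear mix of the two signals; alpha weights SICC."""
    return (alpha * sicc_similarity(clicks_a, clicks_b)
            + (1.0 - alpha) * sisr_similarity(results_a, results_b, k))
```

A score of this kind could serve as the pairwise similarity fed to a clustering step in QIM. A rank-aware measure would be a natural replacement for the set overlap in `sisr_similarity` if result order matters.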
