Identifying user intent through query refinements

Xiaojuan ZHANG; Wei LU

Abstract

Purpose: In this paper, we attempt to use query refinements to identify users' search intents and seek a method for intent clustering based on real-world query data.

Design/methodology/approach: An experiment was conducted to analyze selected search sessions from the America Online (AOL) query logs with a two-stage approach. The first stage identifies the underlying intent by combining query co-occurrence information with query expression similarity. The second stage clusters the identified results by constructing query vectors through random walks on a Markov graph.

Findings: Average correctness for identifying search intent is 0.74. Precision, recall, and F-score values for intent clustering are 0.73, 0.72, and 0.71, respectively. The results indicate that combining session co-occurrence information with query expression similarity can further filter out noise, and that our clustering method is more suitable for sparse data.

Research limitations: We use a time-out threshold (15 minutes) to group queries into one session, but a user may pursue multiple search goals at the same time, and such multi-task behavior is hard to capture in a session defined by time alone.

Practical implications: This study provides insights into ways of understanding users' search intents by analyzing their queries and refinements from a new perspective. The results will help search engine developers identify user intents.

Originality/value: We propose a new method to identify users' search intents by combining session co-occurrence information and query expression similarity, and a new method for clustering sparse data.
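The first stage described in the abstract can be illustrated with a minimal sketch. It groups one user's queries into sessions with the 15-minute time-out named above, counts how often two queries share a session, and treats a pair as pointing to the same intent only when both the co-occurrence and a similarity check pass. The abstract does not say which similarity measure or thresholds the authors used, so the token-level Jaccard measure, the `min_cooc` and `min_sim` values, and all function names here are assumptions made for illustration only.

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import combinations

TIMEOUT = timedelta(minutes=15)  # session time-out stated in the abstract


def split_sessions(log):
    """Split one user's time-sorted (timestamp, query) log into sessions.

    A new session starts whenever the gap to the previous query exceeds TIMEOUT.
    """
    sessions = []
    last_time = None
    for ts, query in log:
        if last_time is None or ts - last_time > TIMEOUT:
            sessions.append([])
        sessions[-1].append(query)
        last_time = ts
    return sessions


def cooccurrence(sessions):
    """Count, for each unordered query pair, the sessions containing both."""
    counts = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    return counts


def jaccard(a, b):
    """Token-level Jaccard similarity (an assumed stand-in for the paper's
    'query expression similarity')."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def same_intent(a, b, counts, min_cooc=1, min_sim=0.3):
    """Combine both signals; thresholds are hypothetical, not from the paper."""
    key = tuple(sorted((a, b)))
    return counts[key] >= min_cooc and jaccard(a, b) >= min_sim
```

Requiring both signals is what lets one filter the other: queries that merely share a session but have unrelated wording are dropped, as are similarly worded queries that never appear together.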
