Purpose: In this paper, we use query refinements to identify users' search intents and seek a method for clustering those intents based on real-world query data.

Design/methodology/approach: We conducted an experiment that analyzes selected search sessions from the America Online (AOL) query logs using a two-stage approach. The first stage identifies the underlying intent by combining query co-occurrence information with query expression similarity. The second stage clusters the identified results by constructing query vectors through random walks on a Markov graph.

Findings: The average correctness for identifying search intent is 0.74. The precision, recall, and F-score values for intent clustering are 0.73, 0.72, and 0.71, respectively. The results indicate that combining session co-occurrence information with query expression similarity can further filter out noise, and that our clustering method is better suited to sparse data.

Research limitations: We use a 15-minute time-out threshold to group queries into sessions. However, a user may pursue multiple search goals at the same time, and such multi-tasking behavior is hard to capture in a session defined purely by time.

Practical implications: This study offers a new perspective on understanding users' search intents by analyzing their queries and refinements. The results will help search engine developers identify user intents.

Originality/value: We propose a new method for identifying users' search intents that combines session co-occurrence information with query expression similarity, and a new method for clustering sparse data.
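The time-out session grouping mentioned under Research limitations can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_sessions`, the `(timestamp, query)` input format, and the sample queries are hypothetical, and the sketch assumes the 15-minute threshold applies to the gap between consecutive queries from the same user.

```python
from datetime import datetime, timedelta


def split_sessions(queries, timeout=timedelta(minutes=15)):
    """Group one user's (timestamp, query) pairs into sessions.

    Assumption: a new session starts whenever the gap between two
    consecutive queries exceeds `timeout`.
    """
    sessions = []
    for ts, text in sorted(queries):
        # Compare with the timestamp of the last query in the current session.
        if sessions and ts - sessions[-1][-1][0] <= timeout:
            sessions[-1].append((ts, text))
        else:
            sessions.append([(ts, text)])
    return sessions


# Hypothetical log entries for a single user.
log = [
    (datetime(2006, 3, 1, 10, 0), "jaguar"),
    (datetime(2006, 3, 1, 10, 5), "jaguar car price"),
    (datetime(2006, 3, 1, 11, 0), "python tutorial"),
]
sessions = split_sessions(log)
```

Under this assumption the first two queries fall into one session and the third starts a new one. A purely time-based rule like this cannot separate interleaved search goals, which is the limitation the abstract points out.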