Journal: Journal of Web Semantics

Discovering and understanding word level user intent in Web search queries

Abstract

Identifying and interpreting user intent are fundamental to semantic search. In this paper, we investigate the association of intent with individual words of a search query. We propose that words in queries can be classified as either content or intent, where content words represent the central topic of the query, while users add intent words to make their requirements more explicit. We argue that intelligent processing of intent words can be vital to improving result quality, and in this work we focus on intent word discovery and understanding. Our approach to intent word detection is motivated by the hypothesis that query intent words satisfy certain distributional properties in large query logs, similar to those of function words in natural language corpora. Following this idea, we first demonstrate the effectiveness of our corpus distributional features, namely word co-occurrence counts and entropies, for function word detection in five natural languages. Next, we show that reliable detection of intent words in queries is possible using these same features computed from query logs. To make the distinction between content and intent words more tangible, we additionally provide operational definitions of content and intent words as those words that should match, and those that need not match, respectively, in the text of relevant documents. In addition to a standard evaluation against human annotations, we also provide an alternative validation of our ideas using clickthrough data. Concordance of the two orthogonal evaluation approaches provides further support for our original hypothesis that two distinct word classes exist in search queries. Finally, we provide a taxonomy of intent words derived through rigorous manual analysis of large query logs.
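
The abstract names word co-occurrence counts and entropies as the distributional features but does not define them here. The following is a minimal sketch of one plausible reading, not the authors' implementation: for each word, it counts the distinct words that share a query with it and computes the Shannon entropy of that co-occurrence distribution. The function name `cooccurrence_features` and the toy query log are hypothetical and serve only as illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_features(queries):
    """Map each word to (distinct co-occurring words, entropy in bits).

    Assumption: two words co-occur when they appear in the same query.
    The paper may define the features differently.
    """
    neighbours = defaultdict(Counter)
    for query in queries:
        words = set(query.lower().split())
        for word in words:
            for other in words:
                if other != word:
                    neighbours[word][other] += 1

    features = {}
    for word, counts in neighbours.items():
        total = sum(counts.values())
        # Shannon entropy of the distribution over co-occurring words.
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
        features[word] = (len(counts), entropy)
    return features


# Hypothetical toy query log, not taken from the paper.
toy_log = [
    "paris hotels",
    "rome hotels",
    "tokyo hotels",
    "paris map",
]
features = cooccurrence_features(toy_log)
```

In this toy log, "hotels" appears alongside three different words and so receives a higher count and entropy than "rome", which only ever appears with "hotels". Under the abstract's hypothesis, that kind of broad, evenly spread co-occurrence is what would mark a candidate intent word; a real classifier would need a large query log and a thresholding or learning step that this sketch omits.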
