Segmenting User Sessions in Search Engine Query Logs Leveraging Word Embeddings

Abstract

Segmenting user sessions in search engine query logs is important for understanding information needs and assessing how well they are satisfied, for improving the quality of search engine rankings, and for better directing content to particular users. Most previous methods use human judgments to inform supervised learning algorithms, and/or use global thresholds on temporal proximity and on simple lexical similarity metrics. This paper proposes a novel unsupervised method that improves on the current state of the art, leveraging additional heuristics and similarity metrics derived from word embeddings. We specifically extend a previous approach based on combining temporal and lexical similarity measurements, integrating semantic similarity components that use pre-trained FastText embeddings. The paper reports on experiments with an AOL query dataset used in previous studies, containing a total of 10,235 queries, with 4,253 sessions, an average of 2.4 queries per session, and 215 unique users. The results attest to the effectiveness of the proposed method, which outperforms a large set of baselines that are also unsupervised techniques.
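
The general idea of combining temporal, lexical, and embedding-based signals can be illustrated with a minimal sketch. This is not the method proposed in the paper: the function names, the 30-minute gap, the 0.3 similarity threshold, and the rule of taking the maximum of the two similarities are all assumptions made for illustration, and a tiny hand-written vector table stands in for pre-trained FastText embeddings.

```python
import math
from datetime import datetime, timedelta


def char_ngrams(text, n=3):
    """Set of character n-grams of a query, padded with one space per side."""
    padded = f" {text.lower().strip()} "
    return {padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))}


def lexical_sim(a, b):
    """Jaccard similarity over character trigrams."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    return len(ga & gb) / len(ga | gb)


def embed(query, vectors):
    """Average the vectors of the known words in a query (None if no word is known)."""
    words = [w for w in query.lower().split() if w in vectors]
    if not words:
        return None
    dim = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is missing or has zero norm."""
    if u is None or v is None:
        return 0.0
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def segment(log, vectors, max_gap=timedelta(minutes=30), min_sim=0.3):
    """Split one user's time-ordered (timestamp, query) log into sessions.

    Assumed rule: a query stays in the current session when it arrives within
    max_gap of the previous query AND is lexically or semantically similar to it.
    """
    sessions = []
    for ts, query in log:
        if sessions:
            prev_ts, prev_query = sessions[-1][-1]
            sim = max(
                lexical_sim(prev_query, query),
                cosine(embed(prev_query, vectors), embed(query, vectors)),
            )
            if ts - prev_ts <= max_gap and sim >= min_sim:
                sessions[-1].append((ts, query))
                continue
        sessions.append([(ts, query)])
    return sessions


# Hypothetical 2-dimensional vectors standing in for real word embeddings.
toy_vectors = {
    "cheap": [1.0, 0.0], "flights": [0.9, 0.1], "airfare": [0.95, 0.05],
    "pizza": [0.0, 1.0], "recipe": [0.1, 0.9],
}
start = datetime(2006, 3, 1, 9, 0)
log = [
    (start, "cheap flights"),
    (start + timedelta(minutes=2), "airfare"),
    (start + timedelta(minutes=5), "pizza recipe"),
    (start + timedelta(hours=3), "pizza recipes"),
]
sessions = segment(log, toy_vectors)
print([[q for _, q in s] for s in sessions])
```

In this toy log, "airfare" shares no character trigrams with "cheap flights", so only the embedding similarity can keep the two queries together, while "pizza recipes" is lexically close to "pizza recipe" but is separated from it by the time gap.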

Bibliographic details

Source: International Conference on Theory and Practice of Digital Libraries, 2019, xv 422 p., 15 pages in total
Authors: Pedro Gomes; Bruno Martins; Luis Cruz
Original format: PDF
Chinese Library Classification: Electronic libraries, digital libraries
Keywords: Analysis of search engine query logs; User session detection; String similarity metrics; Word embeddings
Date indexed: 2022-08-20 20:19:26

Similar documents

1. Reducing Replica of User Query Cluster-Content and Sub-Hyperlinks in the Search Engine Log Based User Profile [J]. P. Srinivasan, K. Batri. Journal of Theoretical and Applied Information Technology. 2013, No. 3
2. Finding competitive keywords from query logs to enhance search engine advertising [J]. Qiao Dandan, Zhang Jin, Wei Qiang. Information & Management. 2017, No. 4
3. Segmenting User Sessions in Search Engine Query Logs Leveraging Word Embeddings [C]. Pedro Gomes, Bruno Martins, Luis Cruz. International Conference on Theory and Practice of Digital Libraries. 2019
4. Leveraging user interaction to improve search experience with difficult and exploratory queries [D]. Kotov, Alexander Sergeyevich. 2011
5. Leveraging User Query Sessions to Improve Searching of Medical Literature [O]. Shiwen Cheng, Vagelis Hristidis, Michael Weiner. 2013
6. Identifying task-based sessions in search engine query logs [O]. Lucchese, Claudio; Orlando, Salvatore; Perego, Raffaele. 2011
