
Exploring privacy and personalization in information retrieval applications.

Abstract

A growing number of information retrieval applications rely on search behavior aggregated over many users. If aggregated data such as search query reformulations is not handled properly, it can allow users to be identified and their privacy compromised. Besides leveraging aggregate data, it is also common for applications to make use of user-specific behavior in order to provide a personalized experience for users. Unlike with aggregate data, privacy is not an issue in individual personalization, since users are the only consumers of their own data. The goal of this work is to explore the effects of personalization and privacy preservation methods on three information retrieval applications, namely search task identification, task-aware query recommendation, and searcher frustration detection. We pursue this goal by first introducing a novel framework called CrowdLogging for logging and aggregating data privately over a distributed set of users. We then describe several privacy mechanisms for sanitizing global data, including one novel mechanism based on differential privacy. We present a template for describing how local user data and global aggregate data are collected, processed, and used within an application, and apply this template to our three applications. We find that sanitizing feature vectors aggregated across users has a low impact on performance for classification applications (search task identification and searcher frustration detection). However, sanitizing free-text query reformulations is extremely detrimental to performance for the query recommendation application we consider. Personalization is useful to some degree in all the applications we explore when integrated with global information, achieving gains for search task identification, task-aware query recommendation, and searcher frustration detection. Finally, we introduce an open source system called CrowdLogger that implements the CrowdLogging framework and also serves as a platform for conducting in-situ user studies of search behavior, prototyping and evaluating information retrieval applications, and collecting labeled data.
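
To make the idea of sanitizing aggregated search data more concrete, the following is a minimal sketch of a generic noise-and-threshold release using the standard Laplace mechanism. It is not the thesis's novel mechanism and not part of CrowdLogging; the function names (`laplace_noise`, `sanitize_counts`), the parameters (`epsilon`, `threshold`, `max_per_user`), and the sample data are all hypothetical and chosen only for illustration. The formal privacy guarantee of such a scheme depends on how the threshold is chosen, which this sketch does not address.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sanitize_counts(user_artifacts, epsilon, threshold, max_per_user=1, seed=None):
    """Release noisy counts of artifacts (e.g. query reformulation pairs).

    user_artifacts: dict mapping a user id to an iterable of hashable,
    sortable artifacts. All names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = Counter()
    for artifacts in user_artifacts.values():
        # Cap each user's contribution so that one user can change the
        # histogram by at most max_per_user (the L1 sensitivity).
        # Sorting only makes the truncation deterministic.
        distinct = sorted(set(artifacts))[:max_per_user]
        counts.update(distinct)

    scale = max_per_user / epsilon  # Laplace scale = sensitivity / epsilon
    released = {}
    for artifact in sorted(counts):
        noisy = counts[artifact] + laplace_noise(scale, rng)
        # Suppress artifacts whose noisy support is too small to release.
        if noisy >= threshold:
            released[artifact] = noisy
    return released


if __name__ == "__main__":
    logs = {
        "u1": [("jaguar", "jaguar car")],
        "u2": [("jaguar", "jaguar car")],
        "u3": [("jaguar", "jaguar car")],
        "u4": [("my street address", "my street address map")],
    }
    print(sanitize_counts(logs, epsilon=1.0, threshold=2.5, seed=0))
```

A smaller `epsilon` adds more noise, and a higher `threshold` suppresses more rare artifacts. Suppression of this kind is one plausible reason why sanitizing free-text reformulations, which are mostly rare, could hurt a query recommendation application more than sanitizing dense feature vectors.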

Bibliographic record

  • Author: Feild, Henry A.
  • Author affiliation: University of Massachusetts Amherst
  • Degree-granting institution: University of Massachusetts Amherst
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 196 p.
  • Total pages: 196
  • Original format: PDF
  • Language: English
  • Date added to database: 2022-08-17 11:41:58
