
An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs



Abstract

Many modern AI applications, such as web search, mobile browsing, image processing, and natural language processing, rely on finding similar items in a large database of complex objects. Because of the very large scale of the data involved (e.g., users' queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations that improve performance and are suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall than "vanilla" LSH, even when using the same amount of space.
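To make the contrast between "vanilla" and multi-probe LSH concrete, the following is a minimal single-machine sketch, not the four variants or the Hadoop implementation evaluated in the paper, none of which are described in this abstract. It assumes random-hyperplane (sign) hashing for cosine similarity and a simple probing rule that visits every bucket within a small Hamming distance of the query's signature. All function names and parameters (`make_hyperplanes`, `probe_sequence`, `max_flips`, and so on) are hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_table(items, planes):
    """Index items (a dict of key -> vector) into one hash table."""
    table = defaultdict(list)
    for key, vec in items.items():
        table[signature(vec, planes)].append(key)
    return table


def probe_sequence(sig, max_flips):
    """Yield sig, then every signature obtained by flipping up to max_flips bits."""
    yield sig
    for r in range(1, max_flips + 1):
        for idxs in combinations(range(len(sig)), r):
            flipped = list(sig)
            for i in idxs:
                flipped[i] ^= 1
            yield tuple(flipped)


def query(vec, table, planes, max_flips=0):
    """Collect candidate keys; max_flips=0 behaves like vanilla single-bucket LSH."""
    candidates = []
    for probe in probe_sequence(signature(vec, planes), max_flips):
        candidates.extend(table.get(probe, []))
    return candidates
```

With `max_flips=0` only the query's own bucket is inspected. Raising `max_flips` inspects neighboring buckets of the same table, which is the sense in which multi-probe schemes can recover more neighbors without allocating additional tables. Candidates would still need to be ranked by an exact similarity computation afterwards.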
