Conference: Annual Conference on Neural Information Processing Systems

b-Bit Minwise Hashing for Estimating Three-Way Similarities


Abstract

Computing two-way and multi-way set similarities is a fundamental problem. This study focuses on estimating 3-way resemblance (Jaccard similarity) using b-bit minwise hashing. While traditional minwise hashing methods store each hashed value using 64 bits, b-bit minwise hashing stores only the lowest b bits (where b > 2 for 3-way). Extending the prior work on 2-way similarity to 3-way similarity is technically non-trivial. We develop a precise estimator that is accurate but very complicated, and we recommend a much simplified estimator suitable for sparse data. Our analysis shows that 6-bit minwise hashing can normally achieve a 10- to 25-fold reduction in the storage space required for a given estimator accuracy of the 3-way resemblance.
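To make the idea concrete, here is a minimal Python sketch of minwise hashing for three sets, followed by truncation of each hashed value to its lowest b bits. It is an illustration under stated assumptions, not the estimator from the paper: the salted BLAKE2b hash, the helper names (`hash64`, `minhash_signature`, `lowest_bits`, `match_fraction`), the toy sets, and the choices k = 256 and b = 6 are all hypothetical. The raw match fraction on truncated values is inflated by accidental collisions; correcting for that is the job of the estimators the abstract mentions, which are not reproduced here.

```python
import hashlib

def hash64(x, seed):
    """64-bit hash of integer x under a seed (salted BLAKE2b; an assumed stand-in
    for a random permutation)."""
    digest = hashlib.blake2b(
        x.to_bytes(8, "little"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")

def minhash_signature(items, k):
    """k minimum hash values, one per seed 0..k-1."""
    return [min(hash64(x, seed) for x in items) for seed in range(k)]

def lowest_bits(signature, b):
    """Keep only the lowest b bits of every stored hash value."""
    mask = (1 << b) - 1
    return [h & mask for h in signature]

def match_fraction(s1, s2, s3):
    """Fraction of positions where all three signatures agree."""
    matches = sum(1 for x, y, z in zip(s1, s2, s3) if x == y == z)
    return matches / len(s1)

# Toy sets with a known 3-way resemblance |S1 & S2 & S3| / |S1 | S2 | S3|.
S1 = set(range(0, 300))
S2 = set(range(50, 350))
S3 = set(range(100, 400))
exact = len(S1 & S2 & S3) / len(S1 | S2 | S3)

k, b = 256, 6
sig1, sig2, sig3 = (minhash_signature(s, k) for s in (S1, S2, S3))

# Full 64-bit values: the match fraction estimates the 3-way resemblance directly.
full_est = match_fraction(sig1, sig2, sig3)

# b-bit values: every full match is still a match, plus accidental collisions,
# so this raw fraction is biased upward and needs a correction before use.
bbit_raw = match_fraction(*(lowest_bits(s, b) for s in (sig1, sig2, sig3)))

print(f"exact={exact:.3f} full={full_est:.3f} raw {b}-bit={bbit_raw:.3f}")
print(f"storage per set: {64 * k} bits vs {b * k} bits")
```

The storage figures printed at the end compare equal k only; the fold improvement quoted in the abstract is for a fixed estimator accuracy, which generally requires a larger k in the b-bit setting.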
