Journal: Computational Intelligence and Neuroscience

MapReduce Based Personalized Locality Sensitive Hashing for Similarity Joins on Large Scale Data


Abstract

Locality Sensitive Hashing (LSH) has been proposed as an efficient technique for similarity joins on high dimensional data. The efficiency and approximation rate of LSH depend on the number of generated false positive instances and false negative instances. In many domains, reducing the number of false positives is crucial. Furthermore, in some application scenarios, balancing false positives and false negatives is favored. To address these problems, in this paper we propose Personalized Locality Sensitive Hashing (PLSH), where a new banding scheme is embedded to tailor the number of false positives, false negatives, and the sum of both. PLSH is implemented in parallel using the MapReduce framework to deal with similarity joins on large scale data. Experimental studies on real and simulated data verify the efficiency and effectiveness of our proposed PLSH technique, compared with state-of-the-art methods.
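
The abstract does not spell out the PLSH banding scheme, so the following is only a minimal sketch of the classic MinHash banding analysis that such schemes typically build on, not the authors' method. It assumes a signature split into b bands of r rows, where two items with similarity s become a candidate pair with probability 1 - (1 - s^r)^b. The function names, the threshold t, and the assumption that pair similarities are spread uniformly over [0, 1] are all hypothetical choices made for illustration.

```python
def candidate_probability(s: float, r: int, b: int) -> float:
    """Probability that two items with similarity s share at least one band.

    Classic banding result: each band of r rows matches with probability
    s**r, and the pair is a candidate if any of the b bands matches.
    """
    return 1.0 - (1.0 - s ** r) ** b


def error_areas(t: float, r: int, b: int, steps: int = 10_000) -> tuple[float, float]:
    """Approximate false positive and false negative mass for threshold t.

    Assumption (illustrative only): pair similarities are uniform on [0, 1].
    False positives: pairs below t that still become candidates.
    False negatives: pairs at or above t that are never candidates.
    Uses a midpoint-rule numerical integration.
    """
    width = 1.0 / steps
    false_pos = 0.0
    false_neg = 0.0
    for i in range(steps):
        s = (i + 0.5) * width
        p = candidate_probability(s, r, b)
        if s < t:
            false_pos += p * width
        else:
            false_neg += (1.0 - p) * width
    return false_pos, false_neg


if __name__ == "__main__":
    # Same signature length (r * b = 100), different band shapes.
    for r, b in [(4, 25), (5, 20), (10, 10)]:
        fp, fn = error_areas(0.5, r, b)
        print(f"r={r:2d} b={b:2d}  false_pos={fp:.4f}  false_neg={fn:.4f}")
```

Under these assumptions, increasing the rows per band lowers the false positive mass and raises the false negative mass, which is the kind of trade-off a banding scheme can be tuned to control.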
