Conference: International Conference on Future Data and Security Engineering

An Efficient Similarity Search in Large Data Collections with MapReduce

Abstract

The era of big data calls for innovations in similarity search. The continuous growth in data volume strains both the processing capacity and the performance of existing information systems. To address this scalability challenge, we propose an efficient MapReduce-based similarity search scheme for large data collections. We apply the scheme to common similarity search cases: pairwise similarity, search by example, range query, and k-Nearest Neighbor query. In addition, collaborative strategic refinements eliminate unnecessary computations and speed up the whole process. Finally, we evaluate our methods experimentally, alongside a previous work, on real large datasets to verify their effectiveness.
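
The abstract does not give the algorithm itself, so the following is only a minimal in-memory sketch of how a MapReduce-style similarity search of this kind might be organized. It is not the authors' method. It assumes cosine similarity over normalized term frequencies and an inverted-index formulation, and every function name (`map_phase`, `shuffle`, `reduce_phase`, `pairwise_similarity`, `range_query`, `knn_query`) is hypothetical. Document pairs that share no term are never emitted, which illustrates one simple way of skipping unnecessary computations.

```python
from collections import defaultdict
from itertools import combinations
import heapq
import math


def map_phase(docs):
    """Map: emit (term, (doc_id, normalized weight)) for each term of each document."""
    for doc_id, text in docs.items():
        counts = defaultdict(int)
        for term in text.lower().split():
            counts[term] += 1
        # L2 norm of the term-frequency vector, so dot products become cosines.
        norm = math.sqrt(sum(c * c for c in counts.values()))
        for term, c in counts.items():
            yield term, (doc_id, c / norm)


def shuffle(pairs):
    """Shuffle: group mapped values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: for each term, emit a partial product for every pair of documents sharing it."""
    for postings in groups.values():
        for (a, wa), (b, wb) in combinations(sorted(postings), 2):
            yield (a, b), wa * wb


def pairwise_similarity(docs):
    """Sum the partial products per document pair to obtain cosine similarities."""
    sims = defaultdict(float)
    for pair, partial in reduce_phase(shuffle(map_phase(docs))):
        sims[pair] += partial
    return dict(sims)


def _neighbors(sims, query_id):
    """Yield (other_doc_id, similarity) for every scored pair that involves query_id."""
    for (a, b), s in sims.items():
        if query_id == a:
            yield b, s
        elif query_id == b:
            yield a, s


def range_query(sims, query_id, threshold):
    """Range query: all documents whose similarity to query_id is at least threshold."""
    hits = [(d, s) for d, s in _neighbors(sims, query_id) if s >= threshold]
    return sorted(hits, key=lambda hit: -hit[1])


def knn_query(sims, query_id, k):
    """k-Nearest Neighbor query: the k documents most similar to query_id."""
    return heapq.nlargest(k, _neighbors(sims, query_id), key=lambda hit: hit[1])


if __name__ == "__main__":
    docs = {
        "d1": "big data search",
        "d2": "big data systems",
        "d3": "cooking recipes",
    }
    sims = pairwise_similarity(docs)
    print(sims)
    print(range_query(sims, "d1", 0.5))
    print(knn_query(sims, "d2", 1))
```

In this sketch, search by example would amount to adding the example document to `docs` under its own identifier and then calling `range_query` or `knn_query` with that identifier. On a real cluster the three phases would run as distributed jobs instead of Python generators.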