Conference: International Conference on Networked Systems

The Out-of-core KNN Awakens: The Light Side of Computation Force on Large Datasets

Abstract

K-Nearest Neighbors (KNN) is a crucial tool for many applications, e.g., recommender systems, image classification, and web-related applications. However, KNN is a resource-greedy operation, particularly for large datasets. We focus on the challenge of KNN computation over large datasets on a single commodity PC with limited memory. We propose a novel approach to compute KNN on large datasets by leveraging both disk and main memory efficiently. The main rationale of our approach is to minimize random accesses to disk, maximize sequential accesses to data, and make efficient use of only the available memory. We evaluate our approach on large datasets in terms of performance and memory consumption. The evaluation shows that our approach requires only 7% of the time needed by an in-memory baseline to compute a KNN graph.
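
The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea it names: keep the vectors on disk, read them in large contiguous blocks rather than one at a time, and hold a bounded number of vectors in memory. It is a generic blocked brute-force KNN graph, not the paper's method. The function names, the flat float64 file layout, and the `block_rows` parameter are all assumptions made for illustration.

```python
import heapq
import os
import tempfile
from array import array

ITEM = array("d").itemsize  # bytes per stored float (assumed layout: raw doubles)

def write_vectors(path, vectors):
    """Store vectors back to back so a block is one contiguous byte range."""
    with open(path, "wb") as f:
        for v in vectors:
            array("d", v).tofile(f)

def read_block(f, start, count, dim):
    """Read `count` consecutive vectors from row `start`: one seek, one read."""
    f.seek(start * dim * ITEM)
    buf = array("d")
    buf.fromfile(f, count * dim)
    return [buf[i * dim:(i + 1) * dim] for i in range(count)]

def knn_graph_out_of_core(path, n, dim, k, block_rows):
    """Brute-force KNN graph with at most two blocks of vectors in memory."""
    graph = {}
    with open(path, "rb") as f:
        for qs in range(0, n, block_rows):
            queries = read_block(f, qs, min(block_rows, n - qs), dim)
            # One bounded heap per query row; distances are negated so the
            # heap root is always the current worst (farthest) neighbor.
            heaps = [[] for _ in queries]
            for cs in range(0, n, block_rows):  # sequential pass over the file
                cands = read_block(f, cs, min(block_rows, n - cs), dim)
                for qi, q in enumerate(queries):
                    for ci, c in enumerate(cands):
                        if qs + qi == cs + ci:
                            continue  # a point is not its own neighbor
                        d = sum((a - b) ** 2 for a, b in zip(q, c))
                        item = (-d, cs + ci)
                        if len(heaps[qi]) < k:
                            heapq.heappush(heaps[qi], item)
                        elif item > heaps[qi][0]:
                            heapq.heapreplace(heaps[qi], item)
            for qi, h in enumerate(heaps):
                # Nearest neighbor first.
                graph[qs + qi] = [idx for _, idx in sorted(h, reverse=True)]
    return graph

# Usage: five 1-D points, blocks of two rows, two neighbors per point.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "vectors.bin")
    write_vectors(demo_path, [[0.0], [1.0], [3.0], [7.0], [15.0]])
    demo_graph = knn_graph_out_of_core(demo_path, n=5, dim=1, k=2, block_rows=2)
    print(demo_graph)
```

In this sketch `block_rows` is the only memory knob: larger blocks mean fewer passes over the file and fewer seeks, at the cost of more resident vectors. The work is still quadratic in the number of points, so it illustrates the disk-access pattern rather than the speedup reported in the abstract.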