Large-scale high-dimensional nearest neighbor search using flash memory with in-store processing

Abstract

Important modern datasets, such as images, videos, protein sequences, or text, usually contain very high-dimensional information from the search point of view. Nearest neighbor search is one of the most fundamental building blocks in dealing with large amounts of data. It is the problem of finding points in a database that are most similar to a query data point by some distance metric. There is a large body of work on algorithms for nearest-neighbor search on large high-dimensional datasets. Since these algorithms invariably involve random access to data, most existing implementations ensure that the data is stored in DRAM and does not spill into secondary storage such as hard disks. However, the immense size of modern datasets often requires hundreds of computers to accommodate the dataset in DRAM. An alternative to such a system is a much smaller cluster that stores the dataset in flash memory (instead of DRAM) and has in-store computing capability. In this paper, we build and demonstrate the performance of high-dimensional nearest-neighbor search on a flash-based system with FPGA acceleration and show that it sometimes exceeds the performance of a DRAM-based solution. We chose two example applications, images and documents, for this demonstration. Since flash storage, in comparison to DRAM, is an order of magnitude cheaper and consumes an order of magnitude less power, a flash-based solution for nearest-neighbor searches offers a viable and attractive alternative.
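As a rough illustration of the problem the abstract defines (finding the database points most similar to a query under some distance metric), the following is a minimal brute-force sketch in Python. It is not the authors' flash and FPGA implementation. The function name `k_nearest`, the toy data, and the choice of Euclidean distance are assumptions made only for this example.

```python
import heapq
import math


def k_nearest(database, query, k):
    """Return the k (distance, index) pairs closest to query, nearest first.

    Assumption: Euclidean distance is the metric; the abstract only says
    "some distance metric".
    """
    # Linear scan: compute the distance from the query to every point.
    scored = ((math.dist(point, query), i) for i, point in enumerate(database))
    # Keep only the k smallest distances, sorted in ascending order.
    return heapq.nsmallest(k, scored)


# Hypothetical toy database of 2-D points; real workloads have far more
# dimensions and far more points.
database = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
result = k_nearest(database, (1.0, 1.0), 2)
nearest_indices = [i for _, i in result]
print(nearest_indices)
```

A linear scan like this reads every point for every query, which is one reason dataset size and where the data is stored matter so much for this workload.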

Bibliographic details

Source
International Conference on Reconfigurable Computing and FPGAs, 2015, pp. 1-8 (8 pages)
Authors
Sang-Woo Jun; Chanwoo Chung; Arvind
Original format: PDF

Similar documents

1. Highly flexible nearest-neighbor-search associative memory with integrated k nearest neighbor classifier, configurable parallelism and dual-storage space [J]. An Fengwei, Mihara Keisuke, Yamasaki Shogo. Japanese Journal of Applied Physics. 2016, issue 4s.
2. High-dimensional image descriptor matching using highly parallel KD-tree construction and approximate nearest neighbor search [J]. Hu Linjia, Nooshabadi Saeid. Journal of Parallel and Distributed Computing. 2019, issue OCTa.
3. Distance Encoded Product Quantization for Approximate K-Nearest Neighbor Search in High-Dimensional Space [J]. Heo Jae-Pil, Lin Zhe, Yoon Sung-Eui. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2019, issue 9.
4. Large-scale high-dimensional nearest neighbor search using flash memory with in-store processing [C]. Sang-Woo Jun, Chanwoo Chung, Arvind. International Conference on Reconfigurable Computing and FPGAs. 2015.
5. Unsupervised Binary Code Learning for Approximate Nearest Neighbor Search in Large-scale Datasets [D]. Zhang, Hao. 2016.
6. Privacy-Enhancing k-Nearest Neighbors Search over Mobile Social Networks [O]. Yuxi Li, Fucai Zhou, Yue Ge. 2021.
7. Combining Nearest Neighbor Search with Tabu Search for Large-Scale Vehicle Routing Problem [O]. Du Lingling, He Ruhan. 2012.
8. Nearest-Neighbor Non-Iterative Error Correcting Optical Associative Memory Processor [R]. Montgomery, B. L., Kumar, V. 1986.
