International Conference on Field-Programmable Technology

CHIP-KNN: A Configurable and High-Performance K-Nearest Neighbors Accelerator on Cloud FPGAs



Abstract

The k-nearest neighbors (KNN) algorithm is an essential algorithm in many applications, such as similarity search, image classification, and database query. With the rapid growth in dataset sizes and the feature dimension of each data point, processing KNN becomes increasingly compute- and memory-intensive. Most prior studies focus on accelerating the computation of KNN using the abundant parallel resources on FPGAs. However, they often overlook memory access optimizations on FPGA platforms and achieve only a marginal speedup over a multithreaded CPU implementation for large datasets. In this paper, we design and implement CHIP-KNN, an HLS-based, configurable, and high-performance KNN accelerator that optimizes off-chip memory access on cloud FPGAs with multiple DRAM or HBM (high-bandwidth memory) banks. CHIP-KNN is configurable for all essential parameters of the algorithm, including the size of the search dataset, the feature dimension of each data point, the distance metric, and the number of nearest neighbors (K). To optimize its performance, we build an analytical performance model to explore the design space and balance computation and memory access performance. Given a user configuration of the KNN parameters, our tool can automatically generate the optimal accelerator design for the given FPGA platform. Our experimental results on the Nimbix cloud computing platform show that, compared to a 16-thread CPU implementation, CHIP-KNN on the Xilinx Alveo U200 FPGA board with four DRAM banks and on the U280 FPGA board with HBM achieves average speedups of 7.5x and 19.8x, and performance/dollar improvements of 6.1x and 16.0x, respectively.
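To make concrete the computation the accelerator targets, the following is a minimal brute-force KNN sketch (a hypothetical reference implementation, not the paper's HLS design; the Euclidean metric and function names are illustrative). It computes O(N·D) distances and keeps the K smallest, which is the distance-compute plus Top-K selection pipeline that CHIP-KNN parallelizes while streaming the dataset from DRAM/HBM banks:

```python
import heapq
import math

def knn(query, dataset, k):
    """Return the k points in `dataset` nearest to `query`.

    Brute-force reference: compute the distance from the query to every
    point, then select the K smallest. The distance metric here is
    Euclidean; CHIP-KNN makes the metric a configurable parameter.
    """
    def euclidean(p):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(query, p)))
    # heapq.nsmallest performs the Top-K selection over all N distances.
    return heapq.nsmallest(k, dataset, key=euclidean)

# Example: 2-D points, K = 2
points = [(0.0, 0.0), (1.0, 1.0), (3.0, 4.0), (0.5, 0.0)]
print(knn((0.0, 0.0), points, k=2))
```

On an FPGA, the per-point distance computations are independent and map naturally onto parallel compute units, so for large datasets the bottleneck shifts to off-chip memory bandwidth — the problem the paper's multi-bank DRAM/HBM access optimization addresses.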
