Journal: IEICE Transactions on Information and Systems

An FPGA Realization of a Random Forest with k-Means Clustering Using a High-Level Synthesis Design

Abstract

A random forest (RF) is an ensemble machine learning algorithm used for classification and regression. It consists of multiple decision trees built from randomly sampled data. Compared with other machine learning algorithms, the RF is simple and offers fast learning and identification, and it is widely used in various recognition systems. Because each tree must be traversed along an unbalanced path and communication across all of the trees is required, the random forest is not well suited to SIMD architectures such as GPUs. Although FPGA-based accelerators have been proposed, those implementations were based on HDL design and thus required longer design time than software-based realizations. In our previous work, we presented a high-level synthesis design of the RF that included a fully pipelined architecture and all-to-all communication. In this paper, to further reduce the amount of hardware, we use k-means clustering to share the comparators of the branch nodes in the decision trees. We also develop the krange tool flow, which generates the bitstream from a small number of hyperparameters. Since the proposed tool flow is based on high-level synthesis, a high-performance RF can be obtained with a shorter design time than with conventional HDL design. We implemented the RF on a Xilinx Inc. ZC702 evaluation board. Compared with CPU (Intel Xeon (R) E5607 Processor) and GPU (NVidia Geforce Titan) implementations, the FPGA realization was 8.4 times faster than the CPU and 62.8 times faster than the GPU. In terms of power efficiency, it was 7.8 times better than the CPU and 385.9 times better than the GPU.
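The comparator-sharing idea in the abstract can be illustrated with a small sketch. The abstract does not say how the clustering is applied, so everything here is an assumption made for illustration: branch nodes are modeled as `(feature, threshold)` pairs, thresholds are clustered separately per feature with plain Lloyd's k-means, and each node's threshold is replaced by its nearest centroid so that nodes with the same pair could share one comparator. The names `kmeans_1d` and `share_comparators` are hypothetical and do not come from the paper or its tool flow.

```python
from collections import defaultdict


def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on scalars with a deterministic initialization."""
    distinct = sorted(set(values))
    k = min(k, len(distinct))
    if k == 1:
        cents = [distinct[0]]
    else:
        # Spread the initial centroids evenly over the sorted distinct values.
        cents = [distinct[i * (len(distinct) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in cents]
        for v in values:
            nearest = min(range(len(cents)), key=lambda i: abs(v - cents[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster received no points.
        new = [sum(g) / len(g) if g else c for g, c in zip(groups, cents)]
        if new == cents:
            break
        cents = new
    return cents


def share_comparators(nodes, k):
    """Snap each (feature, threshold) node to the nearest per-feature centroid.

    Assumption: nodes testing the same feature against the same (clustered)
    threshold can be served by a single hardware comparator.
    """
    by_feature = defaultdict(list)
    for feature, threshold in nodes:
        by_feature[feature].append(threshold)
    centroids = {f: kmeans_1d(ts, k) for f, ts in by_feature.items()}
    merged = [
        (f, min(centroids[f], key=lambda c: abs(c - t))) for f, t in nodes
    ]
    return merged, set(merged)


# Hypothetical branch nodes collected from several trees.
nodes = [(0, 0.10), (0, 0.12), (0, 0.90), (0, 0.88), (1, 5.0), (1, 5.2)]
merged, shared = share_comparators(nodes, k=2)
print(len(set(nodes)), "comparators before,", len(shared), "after")
```

In this toy input, six distinct comparators collapse to four. Replacing thresholds with centroids changes the decision boundaries, so in practice the accuracy of the clustered forest would need to be checked against the original for each choice of `k`.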
