Journal: Microprocessors and Microsystems

Recommender system implementations for embedded collaborative filtering applications


Abstract

This paper first proposes a complete recommender system implemented on reconfigurable hardware for testing on-chip, low-energy embedded collaborative filtering applications. Although the computing time is lower than that obtained from typical multicore microprocessors, the proposal has the advantage of offering a way to solve any collaborative filtering prediction problem in an off-line, highly portable, lightweight computing environment. This approach has been successfully tested with state-of-the-art datasets. Next, as a result of improving certain tasks related to the on-chip recommender system, we propose a custom, fine-grained parallel circuit for fast floating-point matrix multiplication. This circuit was designed to accelerate predictions from the model obtained by the recommender system and was tested with two small datasets for experimental purposes. The accelerator is built on two levels of parallelism. First, several predictions run in parallel through the simultaneous multiplication of different vectors of two matrices. Second, the operation on each vector is itself executed in parallel: pairs of floating-point values are multiplied simultaneously, and the resulting products are then added in parallel as well. The circuit was compared with other approaches designed for the same purpose: circuits built with automated high-level synthesis tools, a general-purpose microprocessor, and high-performance graphics processing units. The prediction accelerator outperformed the other approaches in execution time. We also evaluated the scalability of the circuit to practical problems using the high-level synthesis approach, and confirmed that implementations based on reconfigurable hardware allow acceptable speedups relative to multi-core processors. (C) 2020 Elsevier B.V. All rights reserved.
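The two levels of parallelism described in the abstract can be illustrated with a small software sketch. It is not the authors' circuit: it assumes the model is a pair of latent-factor matrices (one vector per user, one per item) and that a prediction is the dot product of a user vector and an item vector. The names `tree_dot` and `predict_all` are hypothetical, and the code runs sequentially, only mirroring the structure that hardware would evaluate in parallel.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def tree_dot(u: Vector, v: Vector) -> float:
    """Dot product as a multiplier stage followed by a pairwise adder tree."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    # Stage 1: the element-wise products are mutually independent,
    # so hardware could compute all of them at the same time.
    level = [a * b for a, b in zip(u, v)]
    if not level:
        return 0.0
    # Stage 2: each pass adds neighbouring pairs; the additions within
    # one pass are independent. An odd leftover is carried forward.
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


def predict_all(user_factors: Matrix, item_factors: Matrix) -> Matrix:
    """Predict every (user, item) pair; each dot product is independent
    of the others, which is the outer level of parallelism."""
    return [[tree_dot(u, q) for q in item_factors] for u in user_factors]


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    users = [[1.0, 2.0, 0.5], [0.0, 1.0, 2.0]]
    items = [[2.0, 0.5, 4.0], [1.0, 1.0, 1.0]]
    print(predict_all(users, items))
```

With k values per vector, the adder tree needs about log2(k) passes instead of k - 1 sequential additions. Because floating-point addition is not associative, the tree order can differ from a left-to-right sum in the last bits.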
