Journal of Signal Processing Systems for Signal, Image, and Video Technology
MAHASIM: Machine-Learning Hardware Acceleration Using a Software-Defined Intelligent Memory System

Abstract

As the computation in machine-learning applications grows alongside dataset sizes, the energy and performance costs of data movement come to dominate those of compute. This issue is more pronounced in embedded systems with limited resources and energy. Although near-data processing (NDP) has been pursued as an architectural solution, comparatively little attention has been paid to scaling NDP for larger embedded machine-learning applications (e.g., speech and motion processing). We propose machine-learning hardware acceleration using a software-defined intelligent memory system (Mahasim). Mahasim is a scalable NDP-based memory system in which application performance scales with the size of the data. The building blocks of Mahasim are programmable memory slices, supported by data partitioning, compute-aware memory allocation, and an independent in-memory execution model. For recurrent neural networks, Mahasim achieves up to 537.95 GFLOPS/W energy efficiency and a 3.9x speedup as the system grows from 2 to 256 memory slices, indicating that Mahasim favors larger problems.
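The row-wise partitioning behind the abstract's "data partitioning" and "independent in-memory execution model" can be illustrated with a minimal sketch. All names here (`partition_rows`, `slice_compute`, `in_memory_matvec`) are illustrative assumptions, not APIs from the paper: each simulated memory slice holds a row block of an RNN weight matrix and computes its partial output independently, so no inter-slice communication is needed for the matrix-vector product itself.

```python
import numpy as np

def partition_rows(W, n_slices):
    """Split weight matrix W into row blocks, one per memory slice."""
    return np.array_split(W, n_slices, axis=0)

def slice_compute(W_block, x):
    """Work done inside one slice: a purely local matrix-vector product."""
    return W_block @ x

def in_memory_matvec(W, x, n_slices):
    """Concatenate the independently computed per-slice partial outputs."""
    blocks = partition_rows(W, n_slices)
    return np.concatenate([slice_compute(b, x) for b in blocks])

# The partitioned result matches the monolithic product W @ x.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))
x = rng.standard_normal(4)
assert np.allclose(in_memory_matvec(W, x, n_slices=2), W @ x)
```

Because each slice's output depends only on its own rows of `W` and the shared input `x`, adding slices shrinks per-slice work without adding coordination, which is one intuition for why performance would scale with slice count.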
