Source: International Journal of Parallel Programming

DySHARQ: Dynamic Software-Defined Hardware-Managed Queues for Tile-Based Architectures

Abstract

The recent trend towards tile-based manycore architectures has helped to tackle the memory wall by physically distributing memories and processing nodes. However, this distribution introduces a data-to-task locality challenge, and inter-tile communication often imposes significant software overhead. We therefore proposed software-defined, hardware-managed SHARQ queues, which enable efficient inter-tile communication through user-defined queues with arbitrarily sized elements. To ensure (remote) processing of queued elements, SHARQ introduces an optional handler task, which is scheduled by hardware on demand. Queue management, intra- and inter-tile data transfer, and handler task invocation are handled entirely by hardware. Only rare tasks, such as dynamic queue creation at run-time, are performed in software. DySHARQ, an extension of SHARQ, enables dynamic and concurrent queue memory management and queue length adjustments so that queues can adapt to changing application and resource requirements. The DySHARQ hardware monitors the queue memory requirements at run-time and conditionally schedules a software-defined memory management task. It further optimizes the hardware-software interaction for local queue operations. We integrated DySHARQ into the MPI library used by the NAS benchmarks. The evaluation shows a reduction in execution time of up to 43% (compared to software) for the communication-intensive IS kernel in a 4 × 4 tile design on an FPGA platform with a total of 80 LEON3 cores. The dynamic memory management reduces the memory footprint by 3.75× in a 2 × 2 design.
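
The division of labour described in the abstract (user-defined element size, an optional handler task, and a conditionally scheduled memory management task) can be illustrated with a small software-only toy model. This is a sketch under stated assumptions, not the authors' interface: the class `ToyQueue`, the helper `double_capacity`, the `watermark` parameter, and the 75% trigger policy are all hypothetical, and the steps that DySHARQ performs in hardware are simply simulated here as ordinary Python calls.

```python
from collections import deque
from typing import Callable, Deque, Optional


class ToyQueue:
    """Software-only toy model of a queue with fixed-size elements.

    Every name here is hypothetical; in the design described above these
    steps are carried out by hardware, not by Python code.
    """

    def __init__(
        self,
        elem_size: int,
        capacity: int,
        handler: Optional[Callable[[bytes], None]] = None,
        mem_manager: Optional[Callable[["ToyQueue"], None]] = None,
        watermark: float = 0.75,
    ) -> None:
        self.elem_size = elem_size      # bytes per element, chosen by the user
        self.capacity = capacity        # current queue length in elements
        self.handler = handler          # optional task that consumes elements
        self.mem_manager = mem_manager  # optional task that adjusts capacity
        self.watermark = watermark      # occupancy fraction that triggers it
        self.items: Deque[bytes] = deque()

    def enqueue(self, elem: bytes) -> bool:
        if len(elem) != self.elem_size:
            raise ValueError("element does not match the queue's element size")
        # Monitoring: call the memory management task once occupancy
        # reaches the watermark (assumed policy, not taken from the paper).
        if self.mem_manager and len(self.items) >= self.watermark * self.capacity:
            self.mem_manager(self)
        if len(self.items) >= self.capacity:
            return False  # queue full, element rejected
        self.items.append(elem)
        # On-demand handler: consume the oldest element right away.
        if self.handler:
            self.handler(self.items.popleft())
        return True

    def dequeue(self) -> Optional[bytes]:
        return self.items.popleft() if self.items else None


def double_capacity(q: ToyQueue) -> None:
    """Example memory management task: grow the queue by a factor of two."""
    q.capacity *= 2


# Usage: one queue that grows on demand, one that feeds a handler.
growing = ToyQueue(elem_size=4, capacity=4, mem_manager=double_capacity)
accepted = [growing.enqueue(bytes([i] * 4)) for i in range(5)]

received = []
handled = ToyQueue(elem_size=4, capacity=2, handler=received.append)
handled.enqueue(b"ping")
```

In this model the queue without a memory management task rejects elements once it is full, while the queue with one is resized before it overflows. The model says nothing about timing, concurrency, or inter-tile transfer, which are the aspects the hardware implementation targets.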
