CODA: Enabling Co-location of Computation and Data for Multiple GPU Systems

ACM Transactions on Architecture and Code Optimization

Abstract

To exploit the parallelism and scalability of multiple GPUs in a system, it is critical to place compute and data together. However, two key techniques used in traditional GPU systems to hide memory latency and improve thread-level parallelism (TLP), memory interleaving and thread block scheduling, are at odds with the efficient use of multiple GPUs. Distributing data across multiple GPUs to improve overall memory bandwidth utilization incurs high remote traffic when data and compute are misaligned. Nondeterministic thread block scheduling to improve compute resource utilization impedes co-placement of compute and data. Our goal in this work is to enable co-placement of compute and data in the presence of fine-grained interleaved memory, using a low-cost approach.
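To make the tension concrete, the sketch below contrasts the two placement policies with host-side address arithmetic. It is a minimal illustration, not code from the paper: the GPU count, the 256-byte interleaving granularity, the 64KB per-block working set, and the function names are all assumptions chosen for the example.

```cpp
// Minimal sketch (not from the paper): why fine-grained memory interleaving
// conflicts with co-locating a thread block's compute and its data.
// The system size, granularities, and names below are all hypothetical.
#include <cstdint>
#include <cstdio>

constexpr int      kNumGpus        = 4;    // assumed number of GPUs
constexpr uint64_t kInterleaveSize = 256;  // assumed fine-grained stride (bytes)

// Fine-grained interleaving: consecutive 256B chunks rotate across GPUs,
// so any contiguous working set larger than one chunk is scattered over
// several memories, and most accesses become remote traffic.
int OwnerGpuInterleaved(uint64_t addr) {
    return static_cast<int>((addr / kInterleaveSize) % kNumGpus);
}

// Coarse, block-aligned placement: a whole contiguous region maps to one
// GPU, so a scheduler that runs a thread block on the owner of the block's
// base address keeps compute next to its data.
int OwnerGpuBlocked(uint64_t addr, uint64_t regionBytes) {
    return static_cast<int>((addr / regionBytes) % kNumGpus);
}

int main() {
    const uint64_t blockWorkingSet = 64 * 1024;  // hypothetical 64KB per thread block

    // Walk the first few 256B chunks touched by thread block 0: interleaving
    // rotates them over all four GPUs, while the block-aligned placement
    // keeps the entire region local to GPU 0.
    for (uint64_t addr = 0; addr < 8 * kInterleaveSize; addr += kInterleaveSize) {
        printf("addr %5llu -> interleaved GPU %d, blocked GPU %d\n",
               static_cast<unsigned long long>(addr),
               OwnerGpuInterleaved(addr),
               OwnerGpuBlocked(addr, blockWorkingSet));
    }
    return 0;
}
```

Under the blocked mapping, a deterministic scheduler could dispatch each thread block to the GPU owning its base address, which is the co-placement property the abstract targets; under fine-grained interleaving no thread block assignment avoids remote accesses. The challenge the abstract poses is achieving co-placement without giving up the bandwidth benefits of fine-grained interleaving.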
