Journal: IEEE Transactions on Parallel and Distributed Systems

Compiler-Assisted Data Distribution and Network Configuration for Chip Multiprocessors

Abstract

Data access latency, a limiting factor in the performance of chip multiprocessors, grows significantly with the number of cores in nonuniform cache architectures with distributed cache banks. To mitigate this effect, we use a compiler-based approach to leverage data access locality, choose an optimized data placement, and efficiently configure the on-chip network. The proposed experimental compiler framework employs novel compilation techniques to discover and represent multithreaded memory access patterns (MMAPs). At runtime, symbolic MMAPs are resolved and used by a partitioning algorithm to choose a partition of allocated memory blocks among the forked threads in the analyzed application. This partition is used to enforce data ownership by associating the data with the core that executes the thread owning the data. Based on the partition, the communication pattern of the application can be extracted. We demonstrate how this information can be used in an experimental architecture to accelerate applications. In particular, our compiler-assisted data partitioning approach shows a 20 percent speedup over shared caching and a 5 percent speedup over the closest runtime approximation, first touch. By leveraging the communication pattern, we can achieve performance comparable to that of a system that uses a complex centralized network configuration system at runtime. Thus, our final system saves significant runtime complexity and achieves a 5.1 percent additional speedup through the addition of the reconfigurable network.
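The partition-then-extract idea in the abstract can be illustrated with a small sketch. It assumes that per-thread access counts for each memory block are already available (in the paper these would come from the resolved MMAPs). The greedy "most frequent accessor owns the block" rule, the function names, and the sample data are all hypothetical stand-ins, not the partitioning algorithm the authors use.

```python
from collections import defaultdict


def partition_blocks(access_counts):
    """Assign each memory block to one owning thread.

    access_counts maps (thread_id, block_id) -> number of accesses.
    Assumed rule (not from the paper): the thread with the most accesses
    owns the block; ties go to the lowest thread id.
    """
    per_block = defaultdict(dict)
    for (thread, block), n in access_counts.items():
        per_block[block][thread] = per_block[block].get(thread, 0) + n

    owner = {}
    for block, counts in per_block.items():
        # Sort key: highest count first, then lowest thread id.
        owner[block] = min(counts, key=lambda t: (-counts[t], t))
    return owner


def communication_pattern(access_counts, owner):
    """Derive remote traffic implied by a partition.

    Returns a dict mapping (accessing_thread, owning_thread) -> number of
    accesses that must cross the on-chip network.
    """
    traffic = defaultdict(int)
    for (thread, block), n in access_counts.items():
        dst = owner[block]
        if dst != thread:  # local accesses generate no network traffic
            traffic[(thread, dst)] += n
    return dict(traffic)


# Hypothetical access counts for three threads and two blocks.
counts = {(0, "A"): 10, (1, "A"): 2, (0, "B"): 1, (1, "B"): 7, (2, "B"): 7}
owner = partition_blocks(counts)
traffic = communication_pattern(counts, owner)
```

In this toy setting, the resulting traffic table plays the role of the communication pattern: the heaviest thread pairs are the ones a reconfigurable network would most plausibly want to connect directly.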