Conference: 24th ACM international conference on supercomputing 2010

SAMS Multi-Layout Memory: Providing Multiple Views of Data to Boost SIMD Performance

Abstract

We propose to bridge the discrepancy between data representations in memory and those favored by the SIMD processor by customizing the low-level address mapping. To achieve this, we employ the extended Single-Affiliation Multiple-Stride (SAMS) parallel memory scheme at an appropriate level in the memory hierarchy. This level of memory provides both Array of Structures (AoS) and Structure of Arrays (SoA) views of the structured data to the processor, so that it appears to maintain multiple layouts for the same data. With such multi-layout memory, optimal SIMDization can be achieved. Our synthesis results using TSMC 90 nm CMOS technology indicate that the SAMS Multi-Layout Memory system has an efficient hardware implementation, with a critical path delay of less than 1 ns and moderate hardware overhead. Experimental evaluation based on a modified IBM Cell processor model suggests that our approach is able to decrease the dynamic instruction count by up to 49% for a selection of real applications and kernels. Under the same conditions, the total execution time can be reduced by up to 37%.
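
The difference between the AoS and SoA views mentioned in the abstract can be illustrated with a small software model. The sketch below is not the SAMS hardware scheme. It only shows, for one flat buffer, the two access patterns that such a memory would serve: a unit-stride read returning all fields of one record (AoS view) and a strided read returning the same field from several consecutive records (SoA view). The names `FIELDS`, `LANES`, `aos_view`, and `soa_view`, and the record size and SIMD width of 4, are assumptions made for illustration.

```python
# Hypothetical software model: one flat buffer, two access patterns.
# This emulates the two "views" only; it does not model the SAMS
# address mapping or its hardware.

FIELDS = 4  # assumed number of fields per record (e.g. x, y, z, w)
LANES = 4   # assumed SIMD width, in elements


def aos_view(mem, record):
    """Unit-stride read: all FIELDS fields of a single record."""
    base = record * FIELDS
    return mem[base:base + FIELDS]


def soa_view(mem, field, first_record):
    """Strided read: one field taken from LANES consecutive records."""
    base = first_record * FIELDS + field
    return mem[base:base + LANES * FIELDS:FIELDS]


# Four records stored record by record: x0 y0 z0 w0 x1 y1 z1 w1 ...
# The value 10 * r + f encodes record r and field f, for readability.
memory = [10 * r + f for r in range(4) for f in range(FIELDS)]

record_1 = aos_view(memory, 1)      # fields of record 1
field_2 = soa_view(memory, 2, 0)    # field 2 of records 0..3
```

In this model the strided read is an ordinary slice. On a SIMD processor without such memory support, the same gather would typically need extra shuffle or permute instructions, which is the kind of overhead the abstract reports reducing.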