Conference: IEEE International Symposium on High Performance Computer Architecture

Reliability-Aware Data Placement for Heterogeneous Memory Architecture

Abstract

System reliability is a first-class concern as technology continues to shrink, which increases vulnerability to traditional sources of errors such as single-event upsets. By tracking access counts and the Architectural Vulnerability Factor (AVF), application data can be partitioned into groups based on how frequently it is accessed (its "hotness") and how likely it is to cause a program execution error (its "risk"). This is particularly useful for memory systems that are heterogeneous in performance and reliability, such as Heterogeneous Memory Architectures, where a typical configuration combines slow, highly reliable memory with faster, less reliable memory. This work demonstrates that current state-of-the-art, performance-focused data placement techniques adversely affect reliability. It also shows that a page's risk is not necessarily correlated with its hotness; this makes it possible to identify pages that are both hot and low risk, enabling page placement strategies that balance performance and reliability. This work explores heuristics to identify and monitor both hotness and risk at run-time, and further proposes static, dynamic, and program annotation-based reliability-aware data placement techniques. These techniques enable an architect to choose among available memories with diverse performance and reliability characteristics. The proposed heuristic-based reliability-aware data placement improves reliability by 1.6x compared to performance-focused static placement while limiting the performance degradation to 1%. A dynamic reliability-aware migration scheme, which does not require prior knowledge about the application, improves reliability by 1.5x on average while limiting the performance loss to 4.9%. Finally, program annotation-based data placement improves reliability by 1.3x at a performance cost of 1.1%.
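The abstract does not spell out the heuristics, so the following is only a minimal sketch of the general idea of placing pages that are both hot and low risk in the faster, less reliable memory. The `Page` type, the `place_pages` function, the per-page `avf` value in [0, 1], and the `risk_threshold` parameter are all hypothetical names and assumptions introduced for illustration, not the paper's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    page_id: int
    accesses: int  # observed access count, used as the "hotness" signal
    avf: float     # assumed per-page vulnerability estimate in [0, 1] ("risk")


def place_pages(pages, fast_capacity, risk_threshold=0.5):
    """Return (fast, slow) lists of page ids.

    Hypothetical policy: only pages whose risk is below risk_threshold
    may use the fast, less reliable memory. Among those, the hottest
    pages are chosen first until fast_capacity pages are placed.
    Everything else stays in the slow, highly reliable memory.
    """
    candidates = [p for p in pages if p.avf < risk_threshold]
    candidates.sort(key=lambda p: p.accesses, reverse=True)
    fast = [p.page_id for p in candidates[:fast_capacity]]
    fast_set = set(fast)
    slow = [p.page_id for p in pages if p.page_id not in fast_set]
    return fast, slow


pages = [
    Page(0, accesses=900, avf=0.10),  # hot, low risk
    Page(1, accesses=850, avf=0.80),  # hot, high risk
    Page(2, accesses=40, avf=0.05),   # cold, low risk
    Page(3, accesses=30, avf=0.90),   # cold, high risk
    Page(4, accesses=500, avf=0.20),  # warm, low risk
]
fast, slow = place_pages(pages, fast_capacity=2)
print(fast, slow)  # [0, 4] [1, 2, 3]
```

In this toy input, page 1 is nearly as hot as page 0 but stays in the reliable memory because of its high risk, which is the kind of trade-off a purely performance-focused placement would not make.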
