
Cobra: A comprehensive bundle-based reliable architecture


Abstract

The declining robustness of transistors and their ever-denser integration threaten the dependability of future microprocessors. Classic multicores offer a simple way to overcome hardware defects: faulty processors can be disabled without affecting the rest of the system. However, this approach quickly becomes impractical at high fault rates. Recently, distributed computer architectures have been proposed to mitigate the effects of faulty transistors by using fine-grained hardware reconfiguration, managed by fully decoupled control logic. Unfortunately, such solutions gain flexibility at the cost of execution locality, and thus they do not scale to large systems. To overcome this issue we propose Cobra, a distributed, scalable, highly parallel reliable architecture. Cobra is a service-based architecture in which groups of dynamic instructions flow independently through the system, making use of the available hardware resources. Cobra organizes the system's units dynamically using a novel resource assignment that preserves locality and limits communication overhead. Our experiments show that Cobra is extremely dependable and outperforms classic multicores when subjected to 5 or more defects per 100 million transistors. We also show that Cobra operates 80% faster than previous state-of-the-art solutions on multi-programmed SPEC CPU2006 workloads and improves cache hit rate by up to 62%. Our runtime fault detection techniques have a performance impact of only 3%.
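
The claim that disabling whole cores becomes impractical at high fault rates can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the paper: it assumes defects are independent and Poisson-distributed, that any single defect forces the whole core to be disabled, and it uses a hypothetical core size of 50 million transistors. The function names are illustrative only.

```python
import math

# Assumption (not from the paper): a hypothetical core of 50 million transistors.
TRANSISTORS_PER_CORE = 50_000_000


def core_survival_probability(defects_per_100m, transistors_per_core=TRANSISTORS_PER_CORE):
    """Probability that a core contains zero defects.

    Assumes independent, Poisson-distributed defects, so
    P(no defect) = exp(-expected number of defects in the core).
    """
    expected_defects = defects_per_100m * transistors_per_core / 100_000_000
    return math.exp(-expected_defects)


def expected_working_cores(num_cores, defects_per_100m,
                           transistors_per_core=TRANSISTORS_PER_CORE):
    """Expected number of cores left enabled under core-level disabling."""
    return num_cores * core_survival_probability(defects_per_100m, transistors_per_core)


if __name__ == "__main__":
    # Sweep a few defect densities for a hypothetical 16-core chip.
    for density in (1, 5, 10):
        working = expected_working_cores(16, density)
        print(f"{density} defects per 100M transistors: "
              f"{working:.2f} of 16 cores expected to remain usable")
```

Under these assumptions the fraction of usable cores falls exponentially with defect density, which is the kind of pressure that motivates reconfiguring at a finer granularity than a whole core.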
