Exploring System Availability During Software-Based Self-Testing of Multi-core CPUs

Michael A. Skitsas; Chrysostomos A. Nicopoulos; Maria K. Michael

Abstract

As technology scales, the increased vulnerability of modern systems to unreliable components has become a major problem in the era of multi-/many-core architectures. Several on-line testing techniques have recently been proposed to detect errors caused by wear-out/aging-related defects that can appear during the lifetime of a system. In this work, we first investigate the relation between system test latency and test-time overhead in multi-/many-core systems with a shared Last-Level Cache (LLC) under periodic Software-Based Self-Testing (SBST), for different test scheduling policies. Second, we propose a new methodology that aims to reduce the extra testing overhead incurred as the system scales up (i.e., as the number of on-chip cores increases). The investigated scheduling policies primarily vary the number of cores concurrently under test within the overall system test session. Our extensive, workload-driven dynamic exploration reveals an inverse relationship between the two test measures: as the number of cores concurrently under test increases, system test latency decreases, but at the cost of significantly increased test time, which sacrifices system availability for the actual workloads. Under given system test latency constraints, which dictate the recovery time in the event of error detection, our exploration framework identifies the scheduling policy that minimizes the overall test-time overhead and hence maximizes system availability. To evaluate the proposed techniques, multi-/many-core systems with 16 and 64 cores are explored in a full-system, execution-driven simulation framework running multi-threaded PARSEC workloads.
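The trade-off described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' simulation framework: it assumes that per-core test time grows linearly with the number of cores concurrently under test (a stand-in for shared-LLC contention), and all names and parameter values (`base_time`, `contention`, `best_policy`, and so on) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyResult:
    concurrent_cores: int  # cores tested at the same time (k)
    latency: float         # time to finish one full system test session
    overhead: float        # total core-time spent testing instead of working


def per_core_test_time(k, base_time, contention):
    # ASSUMPTION: linear slowdown per extra concurrently tested core,
    # standing in for contention on the shared last-level cache.
    return base_time * (1.0 + contention * (k - 1))


def evaluate_policy(num_cores, k, base_time, contention):
    t = per_core_test_time(k, base_time, contention)
    rounds = math.ceil(num_cores / k)  # groups of k cores tested back to back
    return PolicyResult(k, rounds * t, num_cores * t)


def best_policy(num_cores, candidates, base_time, contention, max_latency):
    # Keep only policies that meet the latency constraint, then pick the
    # one with the smallest test-time overhead (highest availability).
    results = (evaluate_policy(num_cores, k, base_time, contention)
               for k in candidates)
    feasible = [r for r in results if r.latency <= max_latency]
    return min(feasible, key=lambda r: r.overhead) if feasible else None


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for k in (1, 2, 4, 8, 16):
        print(evaluate_policy(16, k, base_time=1.0, contention=0.05))
    print(best_policy(16, (1, 2, 4, 8, 16), 1.0, 0.05, max_latency=5.0))
```

In this toy model, raising the number of concurrently tested cores shortens the session latency but raises the total overhead, so the selected policy is the least aggressive one that still meets the latency constraint. Real systems would need measured test times in place of the linear assumption.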

Bibliographic details

Source
Journal of Electronic Testing: Theory and Applications | 2018, Issue 1 | 15 pages
Authors
Michael A. Skitsas; Chrysostomos A. Nicopoulos; Maria K. Michael
Affiliations

KIOS Research and Innovation Center of Excellence;

Department of Electrical and Computer Engineering, University of Cyprus;

KIOS Research and Innovation Center of Excellence

Original format: PDF
Language: English
Chinese Library Classification: General issues
Keywords
On-line testing; Software-Based Self-Testing; System availability
