Multiple clock domain microarchitecture design and analysis.

Abstract

As clock frequency increases and feature size decreases, clock distribution and skew tolerance present growing challenges to the designers of singly-clocked, globally synchronous processors. We describe a globally-asynchronous, locally-synchronous (GALS) approach, which we call a Multiple Clock Domain (MCD) processor, in which the chip is divided into several clock domains, within which independent voltage and frequency scaling can be performed. Boundaries between domains are chosen to exploit existing queues, thereby minimizing inter-domain synchronization costs. We propose four clock domains corresponding to the front end (including L1 instruction cache), integer units, floating-point units, and load-store units (including L1 data cache and unified L2 cache).

In addition, we quantify the potential energy savings of a specific MCD processor based on the Alpha 21264 microprocessor, using off-line analysis of traces from a broad range of applications. With the results from this off-line algorithm as a benchmark, we describe the design, analysis and performance of a realistic on-line frequency/voltage control algorithm which achieves on average a 19.0% reduction in Energy Per Instruction (EPI), a 3.2% increase in Cycles Per Instruction (CPI), and a 16.7% improvement in the Energy-Delay product, with a Power Savings to Performance Degradation ratio of 4.6. This Energy-Delay product improvement is 85.5% of what was achieved using the off-line algorithm. All of our results (from both the off-line and on-line algorithms) were achieved using a broad mix of compute-bound, memory-bound, and rate-based applications from the MediaBench, Olden, and Spec2000 benchmark suites.

We also demonstrate that the inherent characteristics of an MCD microarchitecture allow internal processor complexity to be dynamically traded for frequency on a per-domain basis. Simply configuring the MCD processor once per application increases performance by 17.6%, on average, compared to the best fully synchronous design. When adapting to application phases, performance improves by 20.4%.

These techniques provide an enabling technology which will allow future processor designs to achieve higher levels of scalability, performance, and energy efficiency than would otherwise be possible with a monolithic synchronous processor.
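The reported averages can be related to one another with a small back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the dissertation: it assumes that the Energy-Delay product scales as EPI × CPI for a fixed instruction count and a fixed reference clock, and the function names are hypothetical. Under that assumption the aggregate figures give an improvement of about 16.4%, close to but not identical to the reported 16.7%; one plausible reason is that the reported numbers are averages over individual applications, which do not compose exactly. The second function simply inverts the "85.5% of the off-line result" statement to estimate the off-line improvement at roughly 19.5%.

```python
def relative_energy_delay(epi_change: float, cpi_change: float) -> float:
    """Fractional change in Energy-Delay product.

    Assumption (not from the source): Energy-Delay scales as EPI * CPI
    for a fixed instruction count and a fixed reference clock.
    Negative return values mean an improvement (lower Energy-Delay).
    """
    return (1.0 + epi_change) * (1.0 + cpi_change) - 1.0


def implied_offline_improvement(online: float, fraction_of_offline: float) -> float:
    """Off-line improvement implied by the on-line result being a given
    fraction of the off-line result."""
    return online / fraction_of_offline


if __name__ == "__main__":
    # Figures quoted in the abstract: EPI -19.0%, CPI +3.2%.
    change = relative_energy_delay(-0.190, 0.032)
    print(f"Energy-Delay change from aggregates: {change:.1%}")

    # On-line improvement of 16.7% is 85.5% of the off-line improvement.
    offline = implied_offline_improvement(0.167, 0.855)
    print(f"Implied off-line improvement: {offline:.1%}")
```

The gap between the computed and reported Energy-Delay figures is a reminder that per-application averages of ratios should be taken from the dissertation itself, not rederived from the aggregate EPI and CPI numbers.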

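Bibliographic record

Author: Semeraro, Greg Philip
Author affiliation: The University of Rochester
Degree-granting institution: The University of Rochester
Subject: Engineering, Electronics and Electrical; Computer Science
Degree: Ph.D.
Year: 2003
Pages: 209 p.
Original format: PDF
Language: English
CLC classification: Radio electronics, telecommunications technology; automation technology, computer technology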
