Automatic design of efficient application-centric architectures.

Abstract

As the market for embedded devices continues to grow, the demand for high performance, low cost, and low power computation grows as well. Many embedded applications perform computationally intensive tasks such as processing streaming video or audio, wireless communication, or speech recognition. Often, performance requirements are on the order of 10-100 billion operations per second and must be met within tight power budgets on the order of 100 mW. Typically, general purpose processors are not able to meet these performance and power requirements. Custom hardware in the form of loop accelerators is often used to execute the compute-intensive portions of these applications because it can achieve significantly higher levels of performance and power efficiency.

Automated hardware synthesis from high level specifications is a key technology used in designing these accelerators, because the resulting hardware is correct by construction, easing verification and greatly decreasing time-to-market in the quickly evolving embedded domain. In this dissertation, a compiler-directed approach is used to design a loop accelerator from a C specification and a throughput requirement. The compiler analyzes the loop and generates a virtual architecture containing sufficient resources to sustain the required throughput. Next, a software pipelining scheduler maps the operations in the loop to the virtual architecture. Finally, the accelerator datapath is derived from the resulting schedule.

In this dissertation, synthesis of different types of loop accelerators is investigated. First, the system for synthesizing single loop accelerators is detailed. In particular, a scheduler is presented that is aware of the effects of its decisions on the resulting hardware, and attempts to minimize hardware cost. Second, synthesis of multifunction loop accelerators, or accelerators capable of executing multiple loops, is presented. Such accelerators exploit coarse-grained hardware sharing across loops in order to reduce overall cost. Finally, synthesis of post-programmable accelerators is presented, allowing changes to be made to the software after an accelerator has been created.

The tradeoffs between the flexibility, cost, and energy efficiency of these different types of accelerators are investigated. Automatically synthesized loop accelerators are capable of achieving order-of-magnitude gains in performance, area efficiency, and power efficiency over processors, and programmable accelerators allow software changes while maintaining highly efficient levels of computation.
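The abstract says the compiler sizes a virtual architecture with enough resources to sustain the required throughput, but it does not say how. A minimal sketch of one common way to do such sizing in software pipelining is given below. It assumes the throughput requirement is expressed as an initiation interval `ii` (cycles between successive loop iterations) and that every function unit is fully pipelined, accepting one operation per cycle. The function name `min_function_units` and the operation labels are hypothetical, chosen for illustration, and are not taken from the dissertation.

```python
from collections import Counter
from math import ceil


def min_function_units(op_types, ii):
    """Fewest function units of each type that can issue every loop
    operation once per `ii` cycles, assuming fully pipelined units."""
    if ii < 1:
        raise ValueError("ii must be a positive number of cycles")
    counts = Counter(op_types)  # operations per type in one loop iteration
    # A single unit offers `ii` issue slots per iteration, so n operations
    # of one type need ceil(n / ii) units.
    return {op: ceil(n / ii) for op, n in counts.items()}


# Hypothetical loop body: 2 loads, 3 multiplies, 2 adds, 1 store.
loop_ops = ["load", "load", "mul", "mul", "mul", "add", "add", "store"]
print(min_function_units(loop_ops, ii=2))
# → {'load': 1, 'mul': 2, 'add': 1, 'store': 1}
```

Under these assumptions, a smaller `ii` (higher throughput) demands more units, which illustrates the cost side of the throughput requirement that the abstract mentions.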

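Bibliographic record

Author: Fan, Kevin C.
Author affiliation: University of Michigan
Degree-granting institution: University of Michigan
Subject: Engineering, Electronics and Electrical; Computer Science
Degree: Ph.D.
Year: 2008
Pages: 131
Original format: PDF
Language: English
Chinese Library Classification: Radio electronics and telecommunications technology; automation technology and computer technology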

