SAT-based compilation to a non-vonNeumann processor

Abstract

This paper describes a compilation technique for accelerating dataflow computations, common in deep neural network computing, on Coarse Grained Reconfigurable Array (CGRA) architectures. The technique has been demonstrated to automatically compile dataflow programs onto a commercial, massively parallel, CGRA-based dataflow processor (DPU) containing 16,000 processing elements. The DPU architecture overcomes the von Neumann bottleneck by spatially flowing and reusing data from local memories, and provides higher computational efficiency than temporal parallel architectures such as GPUs and multi-core CPUs. However, existing software development tools for CGRAs are limited to compiling domain-specific programs to processing elements with uniform structures, and are not effective on complex microarchitectures where memory access latencies vary in a nontrivial fashion depending on data locality. A primary contribution of this paper is a general algorithm that can compile general dataflow graphs and can efficiently utilize processing elements with rich microarchitectural features such as complex instructions, multi-precision data paths, local memories, register files, and switches. Another contribution is a uniquely innovative application of Boolean Satisfiability (SAT) to formally solve this complex and irregular optimization problem and produce high-quality results comparable to assembly code hand-written by human experts. A third contribution is an adaptive windowing algorithm that harnesses the complexity of the SAT-based approach and delivers a scalable and robust solution.
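As a rough illustration of what casting a mapping problem as Boolean Satisfiability can look like, the sketch below places the operations of a tiny dataflow graph onto a one-dimensional row of processing elements. It is not the encoding used in the paper, which the abstract says also handles latencies, register files, switches, and windowing. The function names (`encode_placement`, `dpll`, `solve_placement`), the adjacency rule, and the toy solver are all assumptions made for this example.

```python
from itertools import combinations


def var(op, pe, num_pes):
    """Map an (operation, PE) pair to a positive DIMACS-style variable id."""
    return op * num_pes + pe + 1


def encode_placement(num_ops, num_pes, edges):
    """Build CNF clauses for a toy placement on a 1-D array of PEs.

    Hypothetical rules: every op sits on exactly one PE, no PE hosts two ops,
    and each dataflow edge must connect ops on neighbouring PEs.
    """
    clauses = []
    for op in range(num_ops):
        # At least one PE per op.
        clauses.append([var(op, pe, num_pes) for pe in range(num_pes)])
        # At most one PE per op.
        for a, b in combinations(range(num_pes), 2):
            clauses.append([-var(op, a, num_pes), -var(op, b, num_pes)])
    # No two ops share a PE.
    for pe in range(num_pes):
        for a, b in combinations(range(num_ops), 2):
            clauses.append([-var(a, pe, num_pes), -var(b, pe, num_pes)])
    # Producer on PE p implies consumer on p-1 or p+1.
    for src, dst in edges:
        for pe in range(num_pes):
            nbrs = [q for q in (pe - 1, pe + 1) if 0 <= q < num_pes]
            clauses.append([-var(src, pe, num_pes)]
                           + [var(dst, q, num_pes) for q in nbrs])
    return clauses


def dpll(clauses, assignment=None):
    """Minimal DPLL search; returns {variable: bool} or None if unsatisfiable."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        remaining, satisfied = [], False
        for lit in clause:
            value = assignment.get(abs(lit))
            if value is None:
                remaining.append(lit)
            elif value == (lit > 0):
                satisfied = True
                break
        if satisfied:
            continue
        if not remaining:
            return None  # every literal is false: conflict
        simplified.append(remaining)
    if not simplified:
        return assignment
    # Prefer a unit clause (forced value); otherwise branch on a literal.
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    if unit is not None:
        assignment[abs(unit)] = unit > 0
        return dpll(simplified, assignment)
    lit = simplified[0][0]
    for choice in (lit > 0, lit < 0):
        trial = dict(assignment)
        trial[abs(lit)] = choice
        result = dpll(simplified, trial)
        if result is not None:
            return result
    return None


def solve_placement(num_ops, num_pes, edges):
    """Return {op: pe} for a satisfying placement, or None if none exists."""
    model = dpll(encode_placement(num_ops, num_pes, edges))
    if model is None:
        return None
    return {op: pe
            for op in range(num_ops)
            for pe in range(num_pes)
            if model.get(var(op, pe, num_pes), False)}


if __name__ == "__main__":
    # A three-op chain 0 -> 1 -> 2 on three PEs.
    print(solve_placement(3, 3, [(0, 1), (1, 2)]))
```

A production flow would hand the clauses to an industrial SAT solver rather than this toy search; the point here is only that placement and connectivity rules reduce to plain Boolean clauses.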

Bibliographic record

Source
IEEE/ACM International Conference on Computer-Aided Design | 2017 | pp. 675-682 | 8 pages
Authors
Samit Chaudhuri; Asmus Hetzel
Original format: PDF
Keywords
Registers; Routing; Neural networks; Computational efficiency; Hardware; Memory management
