GPU and FPGA Coprocessors for Data Intensive Computations.

Abstract

With the current norm of multi-core processors, stagnant clock rates, and slowing gains from instruction-level parallelism, it has become increasingly important to exploit parallelism in order to achieve acceptable performance for data-intensive tasks. While multi-core processors are well suited to exploiting thread-level parallelism, they are often a suboptimal choice for problems that exhibit abundant data parallelism. This thesis investigates the application of Graphics Processing Unit (GPU) and Field Programmable Gate Array (FPGA) coprocessors to data-intensive, data-parallel workloads.

Since adopting a unified shader architecture and a general programming model, GPUs have become an increasingly important alternative to general-purpose processors for compute-intensive applications, because their peak floating-point performance is well above that of general-purpose processors. We investigate GPU coprocessors for a simple particle simulation and demonstrate the performance benefit of offloading spatial transformations and basic particle motion calculations to a GPU. We also study a GPU coprocessor for the k-Means clustering algorithm and demonstrate application speedups of 40-70x.

FPGAs are hardware devices capable of implementing arbitrary digital circuits. The vast internal bandwidth and low power consumption afforded by these devices make them an attractive target for certain data-parallel workloads. We investigate an FPGA architecture for Decision Tree Classification that can achieve a speedup of 30x for the split determination phase of the algorithm. We also present a fast pairwise statistical significance estimation architecture using an FPGA coprocessor that offloads the alignment task to an accelerator designed to process multiple independent alignments concurrently, resulting in an end-to-end speedup of more than 200x over a baseline software implementation.
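
To illustrate why k-Means is a natural fit for a data-parallel coprocessor, the following is a minimal sketch of the standard Lloyd iteration in plain Python. It is not the implementation from the thesis; the function names (`assign_point`, `update_centroids`, `kmeans`) and the sample data are assumptions made for illustration. The point to notice is that the assignment step treats every point independently, which is the kind of work a GPU could spread across many threads.

```python
# Illustrative sketch only; not the implementation described in the thesis.
from math import dist  # Euclidean distance, available in Python 3.8+


def assign_point(point, centroids):
    """Return the index of the centroid nearest to one point.

    Each call reads only its own point and the shared centroids, so the
    calls are independent; on a GPU each could map to one thread.
    """
    return min(range(len(centroids)), key=lambda k: dist(point, centroids[k]))


def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # keep an empty cluster's centroid unchanged
            continue
        new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of Lloyd iterations and return (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        labels = [assign_point(p, centroids) for p in points]  # data-parallel step
        centroids = update_centroids(points, labels, centroids)
    return labels, centroids


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

In this sketch the list comprehension inside `kmeans` is the part that would be offloaded; the centroid update is a reduction and typically needs more care to parallelize.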

Bibliographic details

  • Author: Honbo, Daniel
  • Author affiliation: Northwestern University
  • Degree-granting institution: Northwestern University
  • Subject: Computer engineering
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 64 p.
  • Total pages: 64
  • Original format: PDF
  • Language of text: English
