Demand look-ahead memory access scheduling for 3D graphics processing units

Abstract

With the rapidly growing complexity of 3D applications, the memory subsystem has become the most bandwidth-exhausting bottleneck in a Graphics Processing Unit (GPU). To produce realistic images, tens to hundreds of thousands of primitives are used. Furthermore, each primitive generates thousands of pixels, and these pixels are computed by shaders that apply special effects, in some cases blending multiple texture pixels fetched from external memory to obtain a final color. To hide long-latency texture operations, the shaders are usually highly multithreaded to increase their throughput. However, conventional memory scheduling mechanisms are unaware of the producer-consumer relationship between primitives and pixels: they either assume that all initiators are independent or use a fixed priority scheme. This paper proposes Demand Look-Ahead (DLA) memory access scheduling, which uses the status of each unit in the GPU to dynamically generate priorities for the memory request scheduler. By considering the producer-consumer relationship, the proposed mechanism reschedules the most urgent requests to be serviced first. Experimental results show that DLA improves FPS and IPC by 1.47% and 1.44%, respectively, over First-Ready First-Come-First-Serve (FR-FCFS). When DLA is integrated with Bank-level Parallelism Awareness (BPA), DLA-BPA improves FPS and IPC by 7.28% and 6.55%, respectively. Furthermore, DLA-BPA improves shader thread performance by 22.06% and increases the attainable bandwidth by 5.91%.
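
As a rough illustration of the contrast the abstract draws, the sketch below compares a plain FR-FCFS pick with a demand-aware pick that consults a per-unit urgency value first. It is not the paper's algorithm: the `Request` fields, the unit names, the `urgency` map, and the tie-breaking order are all assumptions made for illustration, since the abstract does not specify how DLA derives its priorities.

```python
from dataclasses import dataclass


@dataclass
class Request:
    arrival: int   # arrival order; a lower value means an older request
    bank: int      # DRAM bank targeted by the request
    row: int       # DRAM row targeted by the request
    source: str    # hypothetical name of the issuing GPU unit


def fr_fcfs(queue, open_rows):
    """FR-FCFS: prefer row-buffer hits ("first ready"), then the oldest request."""
    return min(queue, key=lambda r: (open_rows.get(r.bank) != r.row, r.arrival))


def demand_aware(queue, open_rows, urgency):
    """Assumed variant: most urgent source unit first, then FR-FCFS order."""
    return min(
        queue,
        key=lambda r: (
            -urgency.get(r.source, 0),        # higher urgency wins
            open_rows.get(r.bank) != r.row,   # row hit before row miss
            r.arrival,                        # oldest request last tie-breaker
        ),
    )


# Hypothetical queue: one row hit from "texture", two row misses.
queue = [
    Request(arrival=0, bank=0, row=5, source="texture"),
    Request(arrival=1, bank=0, row=7, source="vertex"),
    Request(arrival=2, bank=1, row=3, source="texture"),
]
open_rows = {0: 5, 1: 9}  # currently open row in each bank

first_plain = fr_fcfs(queue, open_rows)
first_urgent = demand_aware(queue, open_rows, {"vertex": 2, "texture": 1})
```

In this example FR-FCFS services the row-hit texture request, while the demand-aware variant services the vertex request because its unit was marked more urgent. With an empty `urgency` map the variant falls back to the FR-FCFS order.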

Bibliographic details

  • Source
    Multimedia Tools and Applications | 2014, Issue 3 | pp. 1391-1416 | 26 pages
  • Author affiliations

    Department of Information and Computer Engineering, Chung Yuan Christian University, 200, Chung Pei Rd., Chung Li 32023, Taiwan;

    Department of Information and Computer Engineering, Chung Yuan Christian University, 200, Chung Pei Rd., Chung Li 32023, Taiwan;

    Department of Information and Computer Engineering, Chung Yuan Christian University, 200, Chung Pei Rd., Chung Li 32023, Taiwan;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Demand look-ahead; GPU; Graphics rendering; Memory access scheduling;
