Conference: The 25th Annual International Symposium on Computer Architecture

Multi-level texture caching for 3D graphics hardware

Abstract

Traditional graphics hardware architectures implement what we call the push architecture for texture mapping. Local memory is dedicated to the accelerator for fast local retrieval of texture during rasterization, and the application is responsible for managing this memory. The push architecture has a bandwidth advantage, but has the disadvantages of limited texture capacity, escalation of accelerator memory requirements (and therefore cost), and poor memory utilization. The push architecture also requires the programmer to solve the bin-packing problem of managing accelerator memory each frame. More recently, graphics hardware on PC-class machines has moved to an implementation of what we call the pull architecture. Texture is stored in system memory and downloaded by the accelerator as needed. The pull architecture has the advantage of texture capacity, stems the escalation of accelerator memory requirements, and has good memory utilization. It also frees the programmer from accelerator texture memory management. However, the pull architecture suffers escalating requirements for bandwidth from main memory to the accelerator. In this paper we propose multi-level texture caching to provide the accelerator with the bandwidth advantages of the push architecture combined with the capacity advantages of the pull architecture. We have studied the feasibility of 2-level caching and found the following: (1) significant re-use of texture between frames; (2) L2 caching requires significantly less memory than the push architecture; (3) L2 caching requires significantly less bandwidth from host memory than the pull architecture; (4) L2 caching enables implementation of smaller L1 caches that would otherwise bandwidth-limit accelerators on the workloads in this paper. Results suggest that an L2 cache achieves the original advantage of the pull architecture, stemming the growth of local texture memory, while at the same time stemming the current explosion in demand for texture bandwidth between host memory and the accelerator.
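
The intuition behind findings (1) and (3) can be illustrated with a toy simulation. This is a minimal sketch and not the simulator used in the paper: LRU replacement, block-granular caching, the cache sizes, and the synthetic workload are all assumptions chosen for illustration, and `LRUCache` and `host_fetches` are hypothetical names. It counts how many texture blocks must be fetched from host memory with an L1 cache alone (a pull-style configuration) and with an additional accelerator-local L2 cache.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity LRU set of texture block ids (LRU is an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block_id):
        """Return True on a hit; on a miss, insert the block and evict the LRU one."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return True
        self.blocks[block_id] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return False


def host_fetches(frames, l1_blocks, l2_blocks=0):
    """Count block fetches from host memory over a sequence of frames.

    l2_blocks=0 models an accelerator with only an L1 cache.
    """
    l1 = LRUCache(l1_blocks)
    l2 = LRUCache(l2_blocks) if l2_blocks > 0 else None
    fetched = 0
    for frame in frames:
        for block_id in frame:
            if l1.access(block_id):
                continue  # served from L1
            if l2 is not None and l2.access(block_id):
                continue  # served from accelerator-local L2
            fetched += 1  # must cross the host-to-accelerator bus
    return fetched


# Hypothetical workload: 10 frames that each touch the same 64 blocks in order,
# i.e. complete texture re-use between frames.
frames = [list(range(64)) for _ in range(10)]
pull_only = host_fetches(frames, l1_blocks=8)
two_level = host_fetches(frames, l1_blocks=8, l2_blocks=64)
print(pull_only, two_level)  # 640 64
```

In this contrived workload the small L1 cache captures no inter-frame re-use, so every access goes to host memory, whereas an L2 cache large enough to hold one frame's working set fetches each block from the host only once. Real workloads re-use texture only partially between frames, so the reduction would be smaller than in this idealized case.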
