Multi-level texture caching for 3D graphics hardware

Abstract

Traditional graphics hardware architectures implement what we call the push architecture for texture mapping. Local memory is dedicated to the accelerator for fast local retrieval of texture during rasterization, and the application is responsible for managing this memory. The push architecture has a bandwidth advantage, but it has the disadvantages of limited texture capacity, escalating accelerator memory requirements (and therefore cost), and poor memory utilization. The push architecture also requires the programmer to solve the bin-packing problem of managing accelerator memory each frame.

More recently, graphics hardware on PC-class machines has moved to an implementation of what we call the pull architecture. Texture is stored in system memory and downloaded by the accelerator as needed. The pull architecture offers greater texture capacity, stems the escalation of accelerator memory requirements, and has good memory utilization. It also frees the programmer from accelerator texture memory management. However, the pull architecture suffers escalating requirements for bandwidth from main memory to the accelerator.

In this paper we propose multi-level texture caching to provide the accelerator with the bandwidth advantages of the push architecture combined with the capacity advantages of the pull architecture. We have studied the feasibility of 2-level caching and found the following: (1) significant re-use of texture between frames; (2) L2 caching requires significantly less memory than the push architecture; (3) L2 caching requires significantly less bandwidth from host memory than the pull architecture; (4) L2 caching enables implementation of smaller L1 caches that would otherwise bandwidth-limit accelerators on the workloads in this paper. Results suggest that an L2 cache achieves the original advantage of the pull architecture (stemming the growth of local texture memory) while at the same time stemming the current explosion in demand for texture bandwidth between host memory and the accelerator.
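
The bandwidth argument in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator: the LRU replacement policy, the block granularity, the cache sizes, and the ten-frame workload are all assumptions chosen only to show how inter-frame texture re-use lets an L2 cache absorb L1 misses that would otherwise go to host memory.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity LRU cache of texture block IDs (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block):
        """Return True on a hit; on a miss, insert the block and evict the LRU one."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        self.blocks[block] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return False


def host_fetches(frames, l1_blocks, l2_blocks=0):
    """Count texture blocks fetched from host memory over a sequence of frames.

    l2_blocks == 0 models the pull architecture: every L1 miss goes to host memory.
    """
    l1 = LRUCache(l1_blocks)
    l2 = LRUCache(l2_blocks) if l2_blocks else None
    fetches = 0
    for frame in frames:
        for block in frame:
            if l1.access(block):
                continue  # served by the on-chip L1 cache
            if l2 is not None and l2.access(block):
                continue  # served by accelerator-local L2 memory
            fetches += 1  # served by host memory
    return fetches


# Hypothetical workload: 10 frames, each touching the same 64 texture blocks in order.
frames = [list(range(64)) for _ in range(10)]
pull_only = host_fetches(frames, l1_blocks=16)
with_l2 = host_fetches(frames, l1_blocks=16, l2_blocks=128)
print(pull_only, with_l2)  # prints "640 64"
```

In this contrived workload the small L1 thrashes, so the pull-only configuration fetches every block from host memory in every frame, while the L2 configuration fetches each block once. Real workloads re-use texture less regularly, so the gap would differ.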

Bibliographic record

Source
The 25th Annual International Symposium on Computer Architecture | 1998 | pp. 86-97 | 12 pages

Authors
Michael Cox; Narendra Bhandari; Michael Shantz

Original format: PDF

Chinese Library Classification: Computing technology, computer technology

Similar literature

Foreign-language literature

1. DyRT: Dynamic Response Textures for Real Time Deformation Simulation with Graphics Hardware [J]. Doug L. James, Dinesh K. Pai. ACM Transactions on Graphics. 2002, Issue 3

2. Cache-efficient numerical algorithms using graphics hardware [J]. Naga K. Govindaraju, Dinesh Manocha. Parallel Computing. 2007, Issue 10-11

3. A Dual-Shader 3-D Graphics Processor With Fast 4-D Vector Inner Product Units and Power-Aware Texture Cache [J]. Yoon J.-S., Yu C.-H., Kim D., et al. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2011, Issue 4

4. Multi-Level Texture Caching for 3D Graphics Hardware [C]. Michael Cox, Narendra Bhandari, Michael Shantz. The 25th Annual International Symposium on Computer Architecture. 1998

5. Minimizing end-to-end interference in I/O stacks spanning shared multi-level buffer caches [D]. Patrick, Christina M. 2011

6. Fast Reduction of Undersampling Artifacts in Radial MR Angiography with 3D Total Variation on Graphics Hardware [O]. Florian Knoll, Markus Unger, Clemens Diwoky, et al. Magnetic Resonance Materials in Physics, Biology and Medicine

7. Multi-Level Texture Caching for 3D Graphics Hardware [O]. Michael Cox, Narendra Bhandari, Michael Shantz. 1998

8. Cache group scheme for hardware-controlled cache coherence and the general need for hardware coherence control in large-scale multiprocessors [R]. Hoag, J. E. 1991

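Chinese-language literature

1. Design of a texture mapping unit for 3D graphics hardware acceleration [J]. 向前, 周珍艮. 冶金动力. 2014, Issue 004

2. Cube panorama texture rendering with multi-level texture detail [J]. 宋颖丽, 牛保宁, 宋春花. 计算机科学与探索. 2017, Issue 009

3. A rendering cloud framework based on virtualized hardware 3D graphics acceleration [J]. 王总辉, 史梳酥, 陈文智. 电信科学. 2012, Issue 010

4. Hardware architecture of 3D graphics chips [J]. 董社勤, 石教英. 微型计算机. 1998, Issue 011

5. An improved adaptive cache management algorithm for second-level caches [J]. 孙国忠, 袁清波, 陈明宇. 计算机研究与发展. 2007, Issue 008

6. A configurable, history-aware multi-level caching policy [C]. Zu Wenqiang (祖文强), Wang Fang. NCIS2015, the 21st National Conference on Information Storage Technology. 2015

7. Defect diagnosis and optimization of a GPU texture cache module based on 3D rendering [A]. 张宽宇. 2019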

Patents

1. Method and mechanism for programmable filtering of texture map data in a 3D graphics subsystem [P]. Chinese patent: CN1910621B. 2010-08-04

2. Method and mechanism for programmable filtering of texture map data in a 3D graphics subsystem [P]. Chinese patent: CN1910621A. 2007-02-07

3. Method and apparatus for multi-level demand caching of textures in a graphics display device [P]. Foreign patent: US6130680A. 2000-10-10

4. 3 Cache memory for 3D graphic texture and its cache miss penalty reducing method [P]. Foreign patent: KR100291628B1. 2001-05-15

5. Streaming prefetching texture cache for level of detail maps in a 3D-graphics engine [P]. Foreign patent: US6433789B1. 2002-08-13
