Published in: IEEE Transactions on Parallel and Distributed Systems

Visibility Rendering Order: Improving Energy Efficiency on Mobile GPUs through Frame Coherence

Abstract

During real-time graphics rendering, the GPU processes objects in the order in which the CPU submits them. Occluded surfaces are often processed even though they will not be part of the final image, which wastes time and energy. To help discard occluded surfaces, most current GPUs include an Early-Depth test before the fragment processing stage. To be effective, however, this test requires that opaque objects be processed in front-to-back order. Depth sorting and other object-level occlusion culling techniques incur overheads that are only offset in applications with substantial depth and/or fragment shading complexity, which is often not the case in mobile workloads. We propose a novel architectural technique for GPUs, Visibility Rendering Order (VRO), which reorders objects front-to-back entirely in hardware by exploiting the fact that objects in animated graphics applications tend to keep their relative depth order across consecutive frames (temporal coherence). Since order relationships are already tested by the Depth Test, VRO incurs minimal energy overhead: it only requires adding a small hardware unit that captures this information and uses it later to guide the rendering of the following frame. Moreover, unlike other approaches, this unit works in parallel with the graphics pipeline without any performance overhead. We illustrate the benefits of VRO using various unmodified commercial 3D applications, for which VRO achieves a 27 percent speed-up and a 15.8 percent energy reduction on average over a state-of-the-art mobile GPU.
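
The reordering idea in the abstract can be illustrated with a toy software model. The sketch below is not the paper's hardware design: the function names `render` and `reorder`, the one-depth-per-object scene, and the use of a topological sort are all assumptions made for illustration. It records which object occluded which during the depth test of one frame, then uses those relationships to submit the next frame front-to-back.

```python
from collections import defaultdict


def render(objects, order):
    """Toy early-depth rasterizer (hypothetical model, not the paper's design).

    objects: dict name -> (depth, set of pixel ids); smaller depth is closer.
    order:   list of object names in submission order.
    Returns (shaded, in_front), where shaded counts fragments that passed the
    depth test and in_front[a] holds objects that a was seen to occlude.
    """
    depth_buffer = {}  # pixel -> (depth, owner)
    shaded = 0
    in_front = defaultdict(set)
    for name in order:
        depth, pixels = objects[name]
        for p in pixels:
            current = depth_buffer.get(p)
            if current is None:
                depth_buffer[p] = (depth, name)
                shaded += 1
            elif depth < current[0]:
                # New fragment wins: it is in front of the previous owner.
                in_front[name].add(current[1])
                depth_buffer[p] = (depth, name)
                shaded += 1
            else:
                # New fragment is discarded: the owner is in front of it.
                in_front[current[1]].add(name)
    return shaded, in_front


def reorder(order, in_front):
    """Order objects so that observed occluders come first (Kahn's algorithm)."""
    indegree = {name: 0 for name in order}
    for a in order:
        for b in in_front.get(a, ()):
            indegree[b] += 1
    ready = [n for n in order if indegree[n] == 0]
    result = []
    while ready:
        a = ready.pop(0)
        result.append(a)
        for b in order:
            if b in in_front.get(a, ()):
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    # Objects caught in a cycle keep their original relative order.
    result += [n for n in order if n not in result]
    return result


# Frame 1: the application submits objects back-to-front (worst case).
frame1 = {
    "sky": (10, set(range(16))),
    "wall": (5, set(range(8))),
    "hero": (1, {2, 3, 4, 5}),
}
submitted = ["sky", "wall", "hero"]
shaded1, relations = render(frame1, submitted)

# Frame 2: the hero moves by one pixel but the depth order is unchanged.
frame2 = dict(frame1, hero=(1, {3, 4, 5, 6}))
guided = reorder(submitted, relations)
shaded2_plain, _ = render(frame2, submitted)
shaded2_guided, _ = render(frame2, guided)
print(guided, shaded1, shaded2_plain, shaded2_guided)
```

In this toy scene the guided order lets the depth test reject the hidden wall and sky fragments before they are shaded, which is the effect the abstract attributes to front-to-back submission. The model ignores transparency, objects spanning a range of depths, and any cost of the reordering itself.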