Exploring multimedia applications locality to improve cache performance


Abstract

This research explores possible ways to improve the performance of multimedia processors [1]. In this context, cache memory performance plays an increasingly critical role in computer systems, since the gap between processor speed and main memory speed tends to widen rather than narrow. The main answer to the heavy workloads of multimedia applications [2] has been to integrate SIMD extensions (such as Pentium MMX, HP MAX2 or UltraSparc VIS) into the computational units to improve parallel computation on image pixels. Moreover, the workload of multimedia applications [3] has a strong impact on cache memory performance, since the locality of memory references in multimedia programs differs from that of traditional programs. As is widely known, programs exhibit two main kinds of locality: spatial and temporal. Nevertheless, as stated in [1], multimedia applications seem to present a new kind of locality, called 2D-spatial locality (i.e. after an access to an address, there is a high probability that future accesses will fall in a two-dimensional neighborhood of it). For this reason, standard cache memory organizations achieve poorer performance when used for multimedia. To achieve an overall performance improvement on specialized multimedia processors, further architectural modifications to the memory hierarchy and its management are needed. These could be coupled with the recent idea of associating programmable components with memory separated from the main processor, such as IRAM [4].
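To make the idea of a two-dimensional neighborhood concrete, the following minimal sketch maps the 3x3 window around a pixel to cache-block numbers. The layout is an assumption for illustration only and is not taken from the paper: a row-major image with 1 byte per pixel, 512-pixel rows, and 32-byte cache blocks. The names `address` and `neighborhood_blocks` are hypothetical.

```python
# Assumptions (not from the source): row-major image, 1 byte per pixel,
# 512-pixel rows, 32-byte cache blocks.
WIDTH = 512
BLOCK_SIZE = 32


def address(row, col, base=0):
    """Byte address of pixel (row, col) in a row-major image."""
    return base + row * WIDTH + col


def neighborhood_blocks(row, col, radius=1):
    """Cache-block numbers touched by the square window around a pixel."""
    blocks = set()
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            blocks.add(address(row + dr, col + dc) // BLOCK_SIZE)
    return sorted(blocks)


blocks = neighborhood_blocks(row=10, col=100)
print(blocks)  # prints [147, 163, 179]
```

Under these assumptions the three rows of the window fall into blocks that are `WIDTH // BLOCK_SIZE` (here 16) blocks apart, so pixels that are neighbors in the image are far apart in the linear address space that a conventional cache sees.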

The first goal of this research is to show that common multimedia applications exhibit 2D-spatial locality. To do this, we developed a benchmark that includes the most common multimedia and image processing applications. Many trace-driven simulations confirm the hypothesis [5][6].
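One simple way to look for 2D-spatial locality in an address trace is sketched below. It is not the benchmark or the simulator of [5][6]; it is an illustrative metric that assumes a row-major image with 1 byte per pixel starting at address 0, and the name `locality_fractions` is hypothetical.

```python
def locality_fractions(trace, width, radius=1):
    """Return (frac_1d, frac_2d) over consecutive pairs of byte addresses.

    frac_1d: next access within `radius` bytes in the linear address space.
    frac_2d: next access within `radius` rows and columns of the image.
    """
    pairs = list(zip(trace, trace[1:]))
    if not pairs:
        return 0.0, 0.0
    near_1d = near_2d = 0
    for a, b in pairs:
        if abs(b - a) <= radius:
            near_1d += 1
        (row_a, col_a), (row_b, col_b) = divmod(a, width), divmod(b, width)
        if abs(row_b - row_a) <= radius and abs(col_b - col_a) <= radius:
            near_2d += 1
    return near_1d / len(pairs), near_2d / len(pairs)


WIDTH = 64  # assumed image width in pixels
column_walk = [r * WIDTH + 5 for r in range(8)]  # down one column
row_walk = [3 * WIDTH + c for c in range(8)]     # along one row

print(locality_fractions(column_walk, WIDTH))  # prints (0.0, 1.0)
print(locality_fractions(row_walk, WIDTH))     # prints (1.0, 1.0)
```

The column walk has no 1D locality at all, yet every step stays inside the 2D neighborhood of the previous pixel, which is the kind of pattern the hypothesis describes.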

Next, we explore techniques that exploit this locality to improve cache performance. Among the various techniques used to improve cache memory performance, prefetching has been one of the most studied and apparently most promising (see [7][8], where, however, no assumption of 2D spatial locality is highlighted). Prefetching techniques can be classified mainly by whether they are implemented in software or in hardware, although some techniques may take advantage of a combined software/hardware implementation [9]. A widely explored approach is hardware prefetching, which pre-loads data into the cache before they are referenced. However, existing hardware prefetching approaches miss part of the potential performance improvement, since they are not tailored to multimedia locality. In this research we propose novel, effective approaches to hardware prefetching for image processing programs in multimedia. In particular, we address multimedia image processing, including algorithms such as the widespread MPEG-2 decoding used to decompress audio/video streams, and typical image processing operations such as convolution for image filtering and edge chain coding, which is used as a pre-processing step in many image analysis tasks. We omit evaluation on sound data (such as MP3 decompression or speech recognition), since these workloads exhibit typical array spatial locality and standard prefetching techniques perform well enough on them.
The algorithms were selected for how widely they are used and for their different data addressing schemes. Convolution is dominated by a regular data addressing scheme that can be predicted a priori, whereas edge chain coding is heavily data dependent: the address sequence of data references depends on the image and cannot be statically predicted. In this case, for example, software prefetching techniques (based on compile-time prediction of future accesses) are not suitable. MPEG-2 exhibits a combination of a regular addressing scheme and data dependency.
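The contrast between the two addressing schemes can be sketched as follows. `convolution_trace` generates the addresses read by a k x k convolution from the image shape alone. `greedy_walk_trace` is a deliberately simplified, hypothetical stand-in for a data-dependent algorithm such as edge chain coding (it is not a chain-coding implementation): it steps to the brightest unvisited neighbor, so its addresses depend on pixel values. Both assume a row-major image with 1 byte per pixel.

```python
def convolution_trace(height, width, k=3):
    """Addresses read by a k x k convolution; depends only on the image shape."""
    r = k // 2
    trace = []
    for row in range(r, height - r):
        for col in range(r, width - r):
            for dr in range(-r, r + 1):
                for dc in range(-r, r + 1):
                    trace.append((row + dr) * width + (col + dc))
    return trace


def greedy_walk_trace(image, start, steps):
    """Toy data-dependent walk: step to the brightest unvisited 8-neighbor."""
    height, width = len(image), len(image[0])
    row, col = start
    visited = {(row, col)}
    trace = [row * width + col]
    for _ in range(steps):
        candidates = [
            (image[row + dr][col + dc], row + dr, col + dc)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= row + dr < height
            and 0 <= col + dc < width
            and (row + dr, col + dc) not in visited
        ]
        if not candidates:
            break
        _, row, col = max(candidates)
        visited.add((row, col))
        trace.append(row * width + col)
    return trace


# Two 4x4 images: a bright horizontal line and a bright vertical line.
image_a = [[0, 0, 0, 0], [9, 9, 9, 9], [0, 0, 0, 0], [0, 0, 0, 0]]
image_b = [[0, 9, 0, 0], [0, 9, 0, 0], [0, 9, 0, 0], [0, 9, 0, 0]]

print(convolution_trace(4, 4)[:9])               # prints [0, 1, 2, 4, 5, 6, 8, 9, 10]
print(greedy_walk_trace(image_a, (1, 0), 3))     # prints [4, 5, 6, 7]
print(greedy_walk_trace(image_b, (0, 1), 3))     # prints [1, 5, 9, 13]
```

The convolution trace never looks at pixel values, so it is the same for both images and could be known at compile time. The walk produces a different address sequence for each image, which is why compile-time prediction does not help in the data-dependent case.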

Typical hardware prefetching techniques are not suitable in this context: techniques based on one-block-lookahead [10] exploit only 1D spatial locality, while adaptive techniques do not cope with the data dependency of some image processing algorithms.
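The limitation of one-block-lookahead on 2D access patterns can be illustrated with a toy cache model. Everything below is an assumption made for illustration: a fully associative LRU cache, prefetch issued only on a demand miss, a 512-byte image row, and 32-byte blocks. The `vertical` policy is a hypothetical example of a 2D-aware prefetcher and is not the technique proposed by the author.

```python
from collections import OrderedDict


def simulate(trace, block_size, capacity, prefetch):
    """Count demand misses in a fully associative LRU cache of `capacity` blocks.

    `prefetch(block)` returns the blocks to load alongside a missed block.
    """
    cache = OrderedDict()
    misses = 0

    def touch(block):
        cache[block] = True
        cache.move_to_end(block)
        while len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used block

    for addr in trace:
        block = addr // block_size
        if block not in cache:
            misses += 1
            for extra in prefetch(block):
                touch(extra)
        touch(block)
    return misses


WIDTH, BLOCK = 512, 32            # assumed row length and block size in bytes
BLOCKS_PER_ROW = WIDTH // BLOCK


def none(block):
    return []


def obl(block):
    return [block + 1]  # one-block-lookahead: next block in address order


def vertical(block):
    # Hypothetical 2D-aware policy: blocks directly above and below.
    return [b for b in (block - BLOCKS_PER_ROW, block + BLOCKS_PER_ROW) if b >= 0]


column_walk = [r * WIDTH + 100 for r in range(16)]  # down one image column
row_walk = [5 * WIDTH + c for c in range(64)]       # along one image row

for name, policy in (("none", none), ("obl", obl), ("vertical", vertical)):
    print(name,
          simulate(column_walk, BLOCK, 8, policy),
          simulate(row_walk, BLOCK, 8, policy))
# prints:
# none 16 2
# obl 16 1
# vertical 8 2
```

In this toy setting one-block-lookahead helps only the row walk, while the vertical policy halves the misses of the column walk. The numbers depend entirely on the assumed parameters and say nothing about the results reported in the paper.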


Bibliographic details

Source
ACM International Conference on Multimedia, 2000, pp. 509-510 (2 pages)

Author
Andrea Prati

Original format
PDF

Classification (Chinese Library Classification)
Computing technology, computer technology

Similar literature

Foreign-language literature

1. A low energy cache design for multimedia applications exploiting set access locality [J]. Yang J, Yu J, Zhang YT. Journal of Systems Architecture. 2005, no. 10-11

2. Line Sharing Cache: Exploring Cache Capacity with Frequent Line Value Locality [J]. Keitarou OKA, Hiroshi SASAKI, Koji INOUE. 電子情報通信学会技術研究報告. 2013, no. 451

3. Power and performance analysis of multimedia applications running on low-power devices by cache modeling [J]. Abu Asaduzzaman, Govipalagodage H. Gunasekara. Multimedia Tools and Applications. 2014, no. 1

4. Exploring Multimedia Applications Locality to Improve Cache Performance [C]. Andrea Prati. ACM International Conference on Multimedia. 2000

5. Improving cache locality for thread-level speculation systems [D]. Fung, Stanley Lap Chiu. 2005

6. Paper trails trailing behind: improving informed consent to IVF through multimedia applications [O]. Jody Lyneé Madeira, Barbara Andraka-Christou. 2016

7. Exploration of the Spatial Locality on Emerging Applications and the Consequences for Cache Performance [O]. Martin Kämpe, Fredrik Dahlgren. 2000

Chinese-language literature

1. Research on using caching mechanisms in ASP.NET to improve web application performance [J]. 刘雷, 宫丽华. 泰山学院学报. 2005, no. 006

2. Building high-performance web applications with ASP.NET caching features [J]. 尹战文, 谭芙蓉. 金融科技时代. 2006, no. 004

3. Using caching techniques to improve the efficiency of web applications [J]. 彭利云. 萍乡高等专科学校学报. 2005, no. 004

4. Driver-level cache: a new cache for improving external storage performance [J]. 刘军, 杨学军, 唐玉华. 计算机工程. 2004, no. 015

5. A method for improving the performance of bufferless networks-on-chip [J]. 张坤, 刘怡俊. 广东工业大学学报. 2017, no. 004

6. Exploring ways to make undergraduate courses in electrical engineering and automation more practical: teaching assisted by the "Power System Multimedia Presentation" software [C]. 刘刚, 张尧. 第四届全国高等学校电气工程及其自动化专业教学改革研讨会. 2007

7. Strategies for improving the visible-light photocatalytic performance of nano-Fe2O3 and insights into the mechanisms [A]. 孟庆强. 2014

Patents

1. A method for improving the service-processing performance of a multimedia messaging center through caching [P]. Chinese patent: CN100359891C. 2008-01-02

2. A method for improving the service-processing performance of a multimedia messaging center through caching [P]. Chinese patent: CN1716917A. 2006-01-04

3. Methods and devices for improving the performance of web browser applications, computer program products for improving the performance of web browser applications, devices for improving the performance of client / server systems [P]. Foreign patent: KR19980703861A. 1998-12-05

4. Request cache to improve web applications performance [P]. Foreign patent: US10594764B2. 2020-03-17

5. Request cache to improve web applications performance [P]. Foreign patent: US2018084075A1. 2018-03-22
