Understanding why correlation profiling improves the predictability of data cache misses in nonnumeric applications

Mowry T.C.; Luk C.-K.

Abstract

Latency-tolerance techniques offer the potential for bridging the ever-increasing speed gap between the memory subsystem and today's high-performance processors. However, to fully exploit the benefit of these techniques, one must be careful to apply them only to the dynamic references that are likely to suffer cache misses; otherwise, the runtime overheads can potentially offset any gains. In this paper, we focus on isolating dynamic miss instances in nonnumeric applications, which is a difficult but important problem. Although compilers cannot statically analyze data locality in nonnumeric applications, one viable approach is to use profiling information to measure the actual miss behavior. Unfortunately, the state of the art in cache miss profiling (which we call summary profiling) is inadequate for references with intermediate miss ratios: it either misses opportunities to hide latency or inserts unnecessary overhead. To overcome this problem, we propose and evaluate a new profiling technique that helps predict which dynamic instances of a static memory reference will hit or miss in the cache: correlation profiling. Our experimental results demonstrate that roughly half of the 21 nonnumeric applications we study can potentially enjoy significant reductions in memory stall time by exploiting at least one of the three forms of correlation profiling we consider: control-flow correlation, self correlation, and global correlation. In addition, our detailed case studies illustrate that self correlation succeeds because a given reference's cache outcomes often contain repeated patterns, and control-flow correlation succeeds because cache outcomes are often call-chain dependent. Finally, we suggest a number of ways to exploit correlation profiling in practice and demonstrate that software prefetching can achieve better performance on a modern superscalar processor when directed by correlation profiling rather than summary profiling information.
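
The contrast between summary profiling and self correlation can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the history length, and the hit/miss trace are all hypothetical, chosen only to show how conditioning on a reference's own recent outcomes can separate hits from misses when the overall miss ratio is intermediate.

```python
from collections import defaultdict

def summary_miss_ratio(outcomes):
    """Summary profile: a single miss ratio for one static reference."""
    return sum(outcomes) / len(outcomes)

def self_correlation_profile(outcomes, history_len=3):
    """Miss ratio conditioned on the reference's previous `history_len` outcomes.

    `outcomes` is a list of 0 (hit) and 1 (miss) for one static reference.
    Returns a dict mapping each observed history tuple to a miss ratio.
    """
    counts = defaultdict(lambda: [0, 0])  # history -> [misses, total]
    for i in range(history_len, len(outcomes)):
        history = tuple(outcomes[i - history_len:i])
        counts[history][0] += outcomes[i]
        counts[history][1] += 1
    return {h: misses / total for h, (misses, total) in counts.items()}

# Hypothetical trace with a repeated pattern: miss, hit, hit, hit.
trace = [1, 0, 0, 0] * 8

summary = summary_miss_ratio(trace)        # intermediate ratio, hard to act on
profile = self_correlation_profile(trace)  # per-history ratios

print("summary miss ratio:", summary)
for history, ratio in sorted(profile.items()):
    print(history, "->", ratio)
```

In this made-up trace the summary miss ratio is 0.25, which gives no clear answer on whether to prefetch, while every observed three-outcome history maps to a miss ratio of either 0 or 1. Keying the same table on a call-chain identifier instead of the outcome history would be one way to sketch the control-flow variant, again as an assumption rather than the paper's mechanism.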

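Bibliographic details

Source: IEEE Transactions on Computers, 2000, Issue 4, pp. 369-384 (16 pages)
Authors: Mowry T.C.; Luk C.-K.
Original format: PDF
Language of text: English
Chinese Library Classification: Computing technology, computer technology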

Similar literature

1. A high data-driven understanding: it's just a matter of time before DSS fusion -- the complementary use of OLAP, data mining, and visualization -- extends to nonnumeric data [J]. Erik Thomsen. Database Programming and Design. 1999, Issue 2.
2. Improving Data Cache Performance with Integrated Use of Split Caches, Victim Cache and Stream Buffers [J]. Afrin Naz, Mehran Rezaei, Krishna Kavi. Computer architecture news. 2005, Issue 3.
3. Exploiting spatial-temporal correlations to improve energy-efficiency in data collection applications in WSN [J]. Oualid Demigha, Slimane Bedda, Mossab Chabane. International journal of communication networks and distributed systems. 2019, Issue 2.
4. Predicting data cache misses in non-numeric applications through correlation profiling [C]. Todd C. Mowry, Chi-Keung Luk. Annual ACM/IEEE international symposium on Microarchitecture. 1997.
5. Application of data mining techniques to batch profiles for process understanding and improvement [D]. Johnson, Mark Steven. 2001.
6. Near misses in a cataract theatre: how do we improve understanding and documentation? [O]. K Mandal, W Adams, S Fraser. 2005.
7. Predicting Data Cache Misses in Non-Numeric Applications Through Correlation Profiling [O]. Todd C. Mowry, Chi-Keung Luk. 1997.
