Predicting inter-thread cache contention on a chip multi-processor architecture


Abstract

This paper studies the impact of L2 cache sharing on threads that simultaneously share the cache, on a chip multi-processor (CMP) architecture. Cache sharing impacts threads nonuniformly, where some threads may be slowed down significantly, while others are not. This may cause severe performance problems such as sub-optimal throughput, cache thrashing, and thread starvation for threads that fail to occupy sufficient cache space to make good progress. Unfortunately, there is no existing model that allows extensive investigation of the impact of cache sharing. To allow such a study, we propose three performance models that predict the impact of cache sharing on co-scheduled threads. The input to our models is the isolated L2 cache stack distance or circular sequence profile of each thread, which can be easily obtained on-line or off-line. The output of the models is the number of extra L2 cache misses for each thread due to cache sharing. The models differ by their complexity and prediction accuracy. We validate the models against a cycle-accurate simulation that implements a dual-core CMP architecture, on fourteen pairs of mostly SPEC benchmarks. The most accurate model, the inductive probability model, achieves an average error of only 3.9%. Finally, to demonstrate the usefulness and practicality of the model, a case study that details the relationship between an application's temporal reuse behavior and its cache sharing impact is presented.
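The abstract names the isolated L2 stack distance profile as a model input. As a rough illustration of what such a profile contains, the following minimal sketch builds one for a single cache set under LRU replacement. It is not one of the paper's three models: the function names, the toy trace, and the idea of re-reading the profile with fewer "effective" ways are assumptions made here for illustration only.

```python
def stack_distance_profile(trace, assoc):
    """Build an LRU stack distance profile for one cache set.

    counters[i] counts hits at LRU stack position i (0 = most recently used).
    misses counts cold accesses plus reuses deeper than `assoc` positions.
    """
    stack = []               # LRU stack, most recently used line first
    counters = [0] * assoc   # one hit counter per stack position
    misses = 0
    for line in trace:
        if line in stack:
            pos = stack.index(line)
            if pos < assoc:
                counters[pos] += 1   # hit at stack position pos
            else:
                misses += 1          # line would already have been evicted
            stack.remove(line)
        else:
            misses += 1              # first touch of this line (cold miss)
        stack.insert(0, line)        # accessed line becomes most recent
    return counters, misses


def misses_with_ways(counters, misses, ways):
    """Misses the same trace would see if only `ways` ways were usable.

    Hits at stack positions >= ways turn into misses (illustrative
    assumption about how a co-runner shrinks the usable cache space).
    """
    return misses + sum(counters[ways:])


# Hypothetical trace of cache-line tags that all map to one set.
trace = ["A", "B", "A", "C", "B", "A"]
counters, misses = stack_distance_profile(trace, assoc=4)
extra = misses_with_ways(counters, misses, ways=2) - misses
print(counters, misses, extra)
```

For this toy trace, a thread that keeps all four ways sees only the three cold misses, while the same thread squeezed into two ways sees two extra misses. The extra-miss count is the kind of quantity the abstract describes as the models' output, though the paper's models derive it differently.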

Bibliographic details

Source
International Symposium on High-Performance Computer Architecture, 2005, 12 pages
Authors
Dhruba Chandra; Fei Guo; Seongbeom Kim; Yan Solihin
Original format: PDF
Chinese Library Classification: TP303-53
Keywords
multiprocessing systems; microprocessor chips; cache storage; computer architecture; computational complexity; multi-threading; interthread cache; chip multiprocessor architecture; coscheduled thread; circular sequence profile; dual-core CMP architecture; inductive probability model; temporal reuse behavior; L2 cache sharing; nonuniform threading; isolated L2 cache stack distance; circular sequence thread profile; L2 cache misses
