Journal: Microprocessors and Microsystems

A resource utilization based instruction fetch policy for SMT processors

Abstract

Simultaneous Multithreading (SMT) architectures have been proposed to better exploit on-chip parallelism, which captures the essence of performance improvement in modern processors. SMT overcomes the limits of a single thread by fetching and executing instructions from multiple threads in a shared fashion. Long-latency operations, however, still cause inefficiency in SMT processors. When instructions have to wait for data from the lower levels of the memory hierarchy, their dependent instructions cannot proceed and therefore continue to occupy shared on-chip resources for an extended number of clock cycles. This introduces undesired inter-thread interference, which in turn degrades overall system throughput and average thread performance. In practice, instruction fetch policies are responsible for assigning thread priority at the fetch stage, in an effort to distribute the shared resources among the threads of a core so that the core copes better with long-latency operations and other runtime thread behavior. In this paper we propose an instruction fetch policy, RUCOUNT, which considers the resource utilization of each individual thread in the prioritization process. The proposed policy observes instructions in the front-end stages of the pipeline as well as low-level data misses to summarize resource utilization for thread management. Higher priority is granted to the thread(s) with less utilized resources, so that overall resources are distributed more efficiently in SMT processors. As a result, it has two distinguishing features compared to other policies: it observes the hardware resources comprehensively, and it monitors limited resource entries. Our experimental results demonstrate that RUCOUNT is 20% better than ICOUNT, 10% better than Stall, 8% better than DG and 3% better than DWarn in terms of averaged performance. Considering that its hardware overhead is at a level similar to that of ICOUNT and DWarn, our proposed instruction fetch policy RUCOUNT is superior among the studied policies.
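
The prioritization idea described in the abstract can be illustrated with a small sketch. The abstract does not give the exact counting rule, so everything below is an assumption made for illustration: the `ThreadState` fields, the additive `utilization` score, and the `miss_weight` parameter are hypothetical and are not taken from the paper. The sketch only shows the general shape of the approach, in which each thread gets a utilization estimate from its front-end instruction count plus its outstanding low-level data misses, and the thread with the lowest estimate is fetched first.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    """Hypothetical per-thread counters; names are illustrative only."""
    tid: int
    frontend_insts: int      # instructions currently in front-end pipeline stages
    outstanding_misses: int  # in-flight low-level data misses


def utilization(t: ThreadState, miss_weight: int = 1) -> int:
    # Assumed scoring rule: a plain weighted sum. The paper may combine
    # these signals differently.
    return t.frontend_insts + miss_weight * t.outstanding_misses


def fetch_order(threads, miss_weight: int = 1):
    # Lower estimated utilization means higher fetch priority.
    # Ties are broken by thread id to keep the order deterministic.
    ranked = sorted(threads, key=lambda t: (utilization(t, miss_weight), t.tid))
    return [t.tid for t in ranked]


threads = [
    ThreadState(tid=0, frontend_insts=12, outstanding_misses=0),
    ThreadState(tid=1, frontend_insts=8, outstanding_misses=6),
    ThreadState(tid=2, frontend_insts=10, outstanding_misses=1),
]

# Counting misses as well as front-end instructions.
print(fetch_order(threads))                 # [2, 0, 1]
# Front-end instruction count only (miss_weight=0), for comparison.
print(fetch_order(threads, miss_weight=0))  # [1, 2, 0]
```

In this toy input, thread 1 has the fewest front-end instructions and would be fetched first if only that count were used, yet it holds six outstanding misses. Once the misses are included in the score it drops to the lowest priority, which is the kind of behavior the abstract attributes to observing resources comprehensively.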
