Squeezing more CPU performance out of a Cray-2 by vector block scheduling

Abstract

Compile-time scheduling of vector activities on the Cray-2 is studied using a simplified model of the vector instruction stream. An approach based on the authors' experience with array-processor microcode scheduling is shown to be practical. It calls for a pass of loop scheduling followed by a pass of resource allocation. Actual benchmarks of the resulting code are shown, exhibiting speedups as large as 50% over the current CFT77 compiler. The results also give a novel perspective on vector chaining vs. nonchaining processor architectures.

Bibliographic details

Source: Supercomputing '88. Vol. 1. Proceedings, pp. 237-245 (9 pages)
Authors: Eisenbeis C.; Jalby W.; Lichnewsky A.
Original format: PDF
Date indexed: 2022-08-26 14:37:10

Similar literature

1. Cache simulation for irregular memory traffic on multi-core CPUs: Case study on performance models for sparse matrix-vector multiplication [J]. James D. Trotter, Johannes Langguth, Xing Cai. Journal of Parallel and Distributed Computing, 2020, issue Octa.
2. High-performance computing of 1/root x(i) and exp(+/- x(i)) for a vector of inputs x(i) on Alpha and IA-64 CPUs [J]. Sharif MH, Basermann A, Seidel C. Journal of Systems Architecture, 2008, issue 7.
3. Importance of explicit vectorization for CPU and GPU software performance [J]. Dickson N.G., Karimi K., Hamze F. Journal of Computational Physics, 2011, issue 13.
4. Squeezing more CPU performance out of a Cray-2 by vector block scheduling [C]. Eisenbeis, C., Jalby, . 1988.
5. Optimized Parallel Training of Word Vectors on Multi-Core CPU and GPU [D]. Simonton, Trevor McDonald. 2017.
6. Performance data of multiple-precision scalar and vector BLAS operations on CPU and GPU [O]. Konstantin Isupov. 2020.
7. Importance of Explicit Vectorization for CPU and GPU Software Performance [O]. Dickson, Neil G., Karimi, Kamran, Hamze, Firas. 2010.
