Horizontally microprogrammable CPUs belong to a class of machines having statically schedulable parallel instruction execution (SPIE machines). Several experiments have shown that within basic blocks, real code only gives a potential speed-up factor of 2 or 3 when compacted for SPIE machines, even in the presence of unlimited hardware. This paper describes similar experiments that instead measure the potential parallelism available to any global compaction method, that is, one which compacts code beyond basic-block boundaries. Global compaction is a subject of current investigation; no measurements yet exist on implemented systems.
The approach taken is to first assume that an oracle is available during compaction. This oracle can resolve all dynamic considerations in advance, giving us the ability to find the maximum parallelism available without reformulation of the algorithm. The parallelism found is constrained only by legitimate data dependencies, since questions of conditional jump directions and unresolved indirect memory references are answered by the oracle. Using such an oracle, we find that typical scientific programs may be sped up by anywhere from 3 to 1000 times. These dramatic results provide an upper bound for global compaction techniques. We describe experiments in progress which attempt to limit the oracle progressively, with the aim of eventually producing one which provides only information that may be obtained by a very good compiler. This will give us a more practical measure of the parallelism potentially obtainable via global compaction methods.
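The oracle measurement can be illustrated with a minimal sketch. It is not the authors' tool: the trace format, the function name `oracle_speedup`, unit operation latency, and the decision to count only true (read-after-write) dependencies are all assumptions made for illustration. The input is the sequence of operations actually executed, so branch directions and memory addresses are already resolved, which is the role the oracle plays. Each operation is scheduled as early as its operands allow on unlimited hardware, and the speed-up is the sequential length divided by the length of the longest dependency chain.

```python
from typing import Dict, List, Tuple

# Hypothetical trace entry: (destination location, source locations).
# Locations are registers or already-resolved memory addresses.
Op = Tuple[str, Tuple[str, ...]]


def oracle_speedup(trace: List[Op]) -> float:
    """Sequential length divided by dataflow-limited length of an executed trace.

    Assumptions (not from the source): unit latency per operation, unlimited
    hardware, and only true data dependencies constrain the schedule.
    """
    ready: Dict[str, int] = {}  # location -> cycle at which its latest value exists
    depth = 0                   # length of the longest dependency chain so far
    for dest, srcs in trace:
        # An operation may start once all of its source values are available.
        start = max((ready.get(s, 0) for s in srcs), default=0)
        finish = start + 1
        ready[dest] = finish
        depth = max(depth, finish)
    return len(trace) / depth if depth else 1.0


# Five operations whose longest chain is three operations long (a -> c -> e).
example = [
    ("a", ()),
    ("b", ()),
    ("c", ("a", "b")),
    ("d", ()),
    ("e", ("c", "d")),
]
print(oracle_speedup(example))
```

A fully serial chain yields a speed-up of 1, while a trace of independent operations yields a speed-up equal to its length. Restricting the oracle, as the experiments in progress do, would correspond to adding constraints to this schedule, for example forbidding an operation from starting before an unresolved branch that precedes it.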