Currently, almost all parallel implementations of programs fix, at design time, the granularity at which parallelism is exploited. Depending on the structure of the application and of the parallel hardware, the programmer decides to exploit parallelism at a fine, coarse, or intermediate granularity, and this granularity does not change at runtime. In this paper we argue that for many applications fixing the granularity in advance is not a good strategy. Instead, it is advantageous to decide the granularity at runtime, as a function of both the available hardware resources and the overheads associated with going to a finer granularity. We present experimental results from a parallel implementation of a geometric constraint satisfaction system to support our thesis. Our results show a significant advantage in using adaptive parallelism.
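To make the idea of choosing granularity at runtime concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes a simple data-parallel workload, and the names `choose_chunk_size` and `adaptive_map`, the parameters `tasks_per_worker` and `min_ratio`, and the cost model itself are all hypothetical. The chunk size is made as fine as the number of workers suggests, but never so fine that per-task overhead dominates the useful work in a chunk.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def choose_chunk_size(n_items, n_workers, item_cost, task_overhead,
                      tasks_per_worker=4, min_ratio=10.0):
    """Pick a chunk size at run time (hypothetical heuristic).

    item_cost and task_overhead are estimates in the same time unit;
    item_cost must be positive.
    """
    if n_items <= 0:
        return 1
    # Hardware side: aim for a few chunks per worker to balance load.
    target = math.ceil(n_items / (max(1, n_workers) * tasks_per_worker))
    # Overhead side: useful work per chunk should be at least
    # min_ratio times the cost of creating and scheduling a task.
    floor = math.ceil(min_ratio * task_overhead / item_cost)
    return min(n_items, max(1, target, floor))


def adaptive_map(fn, items, n_workers, item_cost, task_overhead):
    """Apply fn to every item, splitting work at the chosen granularity."""
    items = list(items)
    size = choose_chunk_size(len(items), n_workers, item_cost, task_overhead)
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    if len(chunks) <= 1:
        # Overhead too high for the available work: run serially.
        return [fn(x) for x in items]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(lambda chunk: [fn(x) for x in chunk], chunks)
        # pool.map preserves input order, so results line up with items.
        return [y for part in parts for y in part]
```

With the same 1000 items, this heuristic yields coarser chunks on 4 workers than on 64, and collapses to a single serial chunk when the estimated task overhead is large relative to the per-item cost. In practice the two cost estimates would have to be measured or profiled, which this sketch leaves out.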
Department of Computer Science, Stanford University, CA