We report on our work in developing a fine-grained multithreaded solution for the communication-intensive Conjugate Gradient (CG) problem. In recent work, we developed a simple yet very efficient solution for executing sparse matrix-vector multiplication (MVM) on a multithreaded system. This paper presents an effective mechanism for the reduction-broadcast phase, which we implemented and integrated with the sparse MVM, resulting in a scalable implementation of the complete CG application.
Three major observations from our experiments on the EARTH multithreaded testbed are: (1) Our CG implementation scales well, e.g., it achieves a speedup of 90 on 120 processors for the NAS CG class B input. (2) Our dataflow-style reduction-broadcast network based on fine-grained multithreading is twice as fast as a serial reduction scheme on the same system. (3) Slowing down the network by a factor of 2 caused no notable degradation of overall CG performance.
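To make the reduction-broadcast phase concrete, the following is a minimal sketch of how the per-processor partial sums of a CG dot product can be combined pairwise in a tree and the total handed back to every processor. It is an illustration only, not the EARTH implementation: the function name `tree_reduce_broadcast`, the list-of-partials representation, and the binary tree shape are all assumptions made for this example, and the code runs sequentially rather than as fine-grained threads.

```python
def tree_reduce_broadcast(partials):
    """Sum per-processor partial values with a binary tree, then broadcast.

    Hypothetical helper for illustration; assumes at least one partial value.
    Returns (values_seen_by_each_processor, number_of_tree_levels).
    """
    values = list(partials)
    n = len(values)
    levels = 0
    stride = 1
    # At each level, processor i absorbs the value held by processor i + stride.
    # On a parallel machine, all additions within one level could run concurrently.
    while stride < n:
        for i in range(0, n - stride, 2 * stride):
            values[i] += values[i + stride]
        stride *= 2
        levels += 1
    total = values[0]
    # Broadcast: every processor ends up with the same global sum.
    return [total] * n, levels


def serial_reduce_broadcast(partials):
    """Baseline: one processor adds all partials one after another."""
    values = list(partials)
    total = values[0]
    dependent_adds = 0
    for v in values[1:]:
        total += v
        dependent_adds += 1
    return [total] * len(values), dependent_adds
```

With 120 partial values, the tree version needs 7 levels of additions, while the serial baseline performs 119 additions in sequence. These counts only describe the dependency depth of this sketch and say nothing about the timings measured on the testbed.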