Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors
Abstract
This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With the increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementations of the NAS benchmarks using native Fortran 77 compiler directives for an Origin2000, a DSM system based on a cache-coherent Non-Uniform Memory Access (ccNUMA) architecture. We report the measured performance of these parallelized benchmarks from four perspectives: efficacy of the parallelization process, scalability, parallelization overhead, and comparison with hand-parallelized and hand-optimized versions of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives, but that realizing the performance gains predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.
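The abstract does not state the model itself. As a rough illustration of how predicted gains and parallelization overhead can interact, the sketch below uses a generic Amdahl-style speedup with an added overhead term. This is an assumption made for illustration, not the model from the paper; the function names, the `overhead_fraction` parameter, and the sample values are all hypothetical.

```python
def predicted_speedup(parallel_fraction, processors, overhead_fraction=0.0):
    """Amdahl-style speedup with a lumped overhead term (illustrative only).

    Times are normalized so that the sequential run time is 1.0.
    parallel_fraction: share of sequential time covered by parallelized code.
    overhead_fraction: hypothetical extra time (synchronization, remote
        memory access, and so on), also as a share of sequential time.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    if overhead_fraction < 0.0:
        raise ValueError("overhead_fraction must be non-negative")

    serial_time = 1.0 - parallel_fraction
    parallel_time = parallel_fraction / processors
    return 1.0 / (serial_time + parallel_time + overhead_fraction)


def efficiency(parallel_fraction, processors, overhead_fraction=0.0):
    """Speedup divided by processor count."""
    return predicted_speedup(parallel_fraction, processors, overhead_fraction) / processors


if __name__ == "__main__":
    # Hypothetical inputs: 90% of the code parallelized, with and without
    # an overhead equal to 10% of the sequential run time.
    for p in (1, 2, 4, 8, 16, 32):
        ideal = predicted_speedup(0.9, p)
        with_overhead = predicted_speedup(0.9, p, overhead_fraction=0.1)
        print(f"p={p:2d}  ideal={ideal:5.2f}  with_overhead={with_overhead:5.2f}")
```

In this simplified form, a fixed overhead lowers the achievable speedup at every processor count, which is in line with the abstract's point that realizing the predicted gains depends on keeping overhead small.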