This paper describes a comparative performance study of the MPI and Remote Memory Access (RMA) communication models in the context of four scientific benchmarks: NAS MG, NAS CG, SUMMA matrix multiplication, and Lennard-Jones molecular dynamics, on clusters with the Myrinet network. It is shown that RMA communication delivers a consistent performance advantage over MPI. In some cases an improvement of as much as 50% was achieved. The benefits of using non-blocking RMA to overlap computation and communication are discussed.