The lack of performance portability has discouraged users of scientific applications from developing portable programs in HPF. Because these users want to run the same source code on different parallel machines as fast as possible, we investigated the performance portability of Japanese HPF compilers (NEC and Fujitsu) with a special benchmark suite. On the NEC SX-5, we obtained good performance in most cases with only DISTRIBUTE and INDEPENDENT directives, whereas on the Fujitsu VPP800 we had to add LOCAL directives to state explicitly that no communication is needed inside parallel loops. We also found that manual communication optimizations using the HPF/JA extensions were very useful for tuning parallel performance.