MPl point-to-point communication is a basic operation, however it requires runtime matching of send and receive that causes to reduce performance. This paper proposes a new approach to send messages by remote memory write without inquiring of the receiver under a communication pattern such that the corresponding nonblocking receive is issued in advance. Basically, this approach makes it possible to gain low latency and high bandwidth as the hardware specification. MPI-EMX, our implementation of the MPI on the EM-X multi- processor, achieves a zero-byte latency of 15.3 βsec and a maximum bandwidth of 31.4 MB/s, which can compete with commercial MPPs. This approach to reduce communication latency is widely applicable to other systems and is quite a promising technique for achieving low latency and high bandwidth.
展开▼