GridMPI 1.1 review
DownloadGridMPI is a new open-source free-software implementation of the standard MPI (Message Passing Interface) library designed for the Gr
|
|
GridMPI is a new open-source free-software implementation of the standard MPI (Message Passing Interface) library designed for the Grid. GridMPI project enables unmodified applications to run on cluster computers distributed across the Grid environment.
GridMPI team found that it is feasible to connect cluster computers and to run ordinary scientific applications in distance upto 500 miles. Simple experiment has shown that most MPI benchmarks scale fine upto 20 millisecond round-trip latency which corresponds to about 500 miles in distance, when the clusters are connected by fast 1 to 10 Gbps networks. 500 miles covers the major cities between Tokyo--Osaka in Japan.
Thus, applications which are too large to run on a local cluster should run on multiple clusters in the Grid environment with acceptable performance. However, it is only feasible when using an efficient MPI implementation [1]. Existing implementations are not efficient enough mainly because of the two reasons: their focus on security features and TCP performance problems.
GridMPI skips security layers assuming dedicated secure links. The institutes housing large clusters tend to have their own networks to connect to other institutes in most cases. GridMPI so focuses on the performance on TCP. Since existing implementations are in most cases designed for MPP machines and recently clusters with special hardware, their performance on TCP with Ethernet is not optimal.
Also TCP performance itself is not optimal for the work load of the MPI traffic. In addition, support for heterogeneous combinations of computers of the existing MPI implementations is not satisfactory. Thus, GridMPI is designed and implemented from the scratch. GridMPI is carefully coded and tested with heterogeneity in mind.
Here are some key features of "GridMPI":
Full conformance to the standard: GridMPI passes 100% of the functional tests of the large test suites from ANL and Intel (MPI-1.2 level).
Full heterogeneity support: GridMPI is fully tested with combinations of processors of 32bit/64bit and big/little-endian.
Primary support of TCP/IP and sockets: GridMPI is written from scratch and it is new and clean. It is efficient with sockets, and thus suitable for the Grid as well as ordinary Ethernet-based clusters.
Cooperation with Grid job submission: GridMPI can be used with Globus, Unicore, tool from NAREGI project, etc.
Checkpointing support: GridMPI supports checkpointing on Linux/IA32 platforms to restart long-running applications from failure.
Vendor MPI support: GridMPI supports IBM-MPI, Fujitsu-Solaris-MPI, Intel-MPI, and any MPICH-based MPI for clusters with special communication hardware.
What's New in This Release:
Minor bugfixes were made.
GridMPI 1.1 search tags