Reliable Multicasts for Beowulfs: Software
This software allows faster computations on Beowulf and similar clusters
by using point-to-multipoint communications to broadcast information
between nodes. See the project description.
Version 0.94, December 2002
Conceptual changes of consequence include:
- Recovery at the UDP datagram level instead of MPI message level.
Much better performance for messages >> 32 KB.
- Channel state eliminated; simpler shutdown sequence
- Resend on EFAULT
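To illustrate the first change above, here is a minimal sketch of datagram-level recovery. It is not taken from nack.c: the 1024-byte payload size and the names dgram_count(), deliver_dgram(), and list_missing() are assumptions made for illustration, and delivery is simulated in memory rather than over sockets. The receiver records which datagrams of a message arrived and reports only the missing sequence numbers, so the sender repeats those datagrams rather than the whole MPI message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAYLOAD 1024  /* assumed payload bytes per datagram; illustrative only */

/* Number of datagrams needed to carry a message of len bytes. */
size_t dgram_count(size_t len)
{
    return (len + PAYLOAD - 1) / PAYLOAD;
}

/* Simulate the arrival of datagram number seq: copy its slice of the
 * sender's message into the receiver's buffer and mark it as received.
 * Returns the number of bytes copied. */
size_t deliver_dgram(const char *msg, size_t len, size_t seq,
                     char *out, unsigned char *got)
{
    size_t off, n;

    assert(seq < dgram_count(len));
    off = seq * PAYLOAD;
    n = (len - off < PAYLOAD) ? len - off : PAYLOAD;
    memcpy(out + off, msg + off, n);
    got[seq] = 1;
    return n;
}

/* Collect the sequence numbers not yet received (what a NACK would carry).
 * Returns how many entries were written to missing[]. */
size_t list_missing(const unsigned char *got, size_t count, size_t *missing)
{
    size_t i, m = 0;

    for (i = 0; i < count; i++)
        if (!got[i])
            missing[m++] = i;
    return m;
}
```

Under these assumptions a 5000-byte message occupies five datagrams. If datagrams 1 and 4 are lost, list_missing() reports exactly those two, and resending them moves 1928 bytes instead of the full 5000. The saving grows with message size, which is consistent with the improvement noted for messages much larger than 32 KB.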
Usage
- Include "mpim.h".
- Call MPIM_Init() instead of (or after) MPI_Init(), and before the first broadcast.
Fortran note 2001nov21: you must call MPI_Init() from Fortran
before calling MPIM_Init(); C users can optionally skip MPI_Init().
- Call MPIM_Finalize() instead of MPI_Finalize()
(this eliminates the alarm interaction with MPI shutdown).
- Replace each call to MPI_Bcast() with a call to MPIM_Bcast()
with exactly the same arguments.
Fortran interfaces for all three calls are included.
Notes
- This version has been tested only under RedHat Linux 6.2-7.2 on Intel and Alpha.
Previous versions were tested on Sparc Solaris 7. Access
to larger clusters for testing, or reports of your own results, would be welcome.
- Use with MPICH; versions 1.2.1-1.2.4 were used for testing and development.
Version 0.94
File descriptions:
- Version 0.94: mpim-2002dec16.tgz: latest version; complete; less thoroughly tested
- nack.c : multicast and TCP ACK functions.
Nominally independent of MPICH
- mpim.c/.h : glue to MPICH: MPIM_Init(), MPIM_Finalize(),
MPIM_Bcast()
- mpimtest.c : simple example program that sends an integer nloop times
using MPI_Bcast() or MPIM_Bcast().
- mpimf.c : Fortran interfaces to the C functions; compile with
-DDOUBLE for function names with __ appended
- channel.c/.h : a convenience structure and functions to manipulate it; basically a socket
- comm.c/.h : ditto; a comm is a set of channels.
- ring.c/.h : ring buffer management
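The -DDOUBLE option listed for mpimf.c concerns how Fortran compilers decorate external names. The following sketch shows one common way such a switch is written in C. It is not copied from mpimf.c: the macro FORTRAN_NAME and the wrapper demo_sum are hypothetical, and the single-underscore default is an assumption. Some compilers, such as g77 in its default configuration, append a second underscore to names that already contain one, which is the case a double-underscore build is meant to cover.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: one trailing underscore by default, two with -DDOUBLE. */
#ifdef DOUBLE
#define FORTRAN_NAME(name) name##__
#else
#define FORTRAN_NAME(name) name##_
#endif

/* Hypothetical wrapper, callable from Fortran as
 *     call demo_sum(a, b, total)
 * Fortran compilers typically pass arguments by reference,
 * so every parameter is a pointer on the C side. */
void FORTRAN_NAME(demo_sum)(const int *a, const int *b, int *total)
{
    assert(a != NULL && b != NULL && total != NULL);
    *total = *a + *b;
}
```

If the link step reports undefined symbols ending in __, rebuilding the interface file with -DDOUBLE is the first thing to try.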
Peter Tamblyn / ptamblyn@astro101.com
Last modified: December 16, 2002