Exercise 6: Set Up a Mini MPI Cluster
EXERCISE DESCRIPTION:
In this exercise you will team up with one to three neighbors to build a true parallel machine composed of 2 to 4 desktop PCs. You can then run simple parallel programs on top of it and compare MPI performance against that of a standalone machine.
EXERCISE GUIDE
Task: Setting up a mini-“cluster”
1. Form teams of 2 to 4 people. Find out the IP addresses of the PCs that will form the cluster. You can find out the IP address of a machine by issuing the command:
$ /sbin/ifconfig
You will need these IP addresses in later steps.
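If the ifconfig command is not available on your system, the ip tool (from the iproute2 package) gives the same information:
$ ip addr show
In either case, look for the "inet" address of your LAN interface (e.g. eth0).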
2. You will need super-user privileges, so become the root user:
$ su -
3. All the PCs in the parallel cluster will need an identical user account (that is, the same user name, user ID, home directory, and so on).
You can add a new user (named "mpi") with the following command:
# useradd -m -d /home/mpi -u 10000 -s /bin/bash mpi
With the following command, set the password of the "mpi" user to 12345678:
# passwd mpi
(type 12345678 twice when prompted)
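As a quick check that the account is really identical on every machine, run the following on each PC; the reported uid (10000) must be the same everywhere:
# id mpi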
4. Enable the "mpi" user account to log in on all the PCs without a password. This can be accomplished with a feature of the ssh program named "public-key" authentication. Steps a to f below are performed on one PC only (this PC will act as the NFS server); which PC does not matter, as long as it is part of your cluster.
a. Install the NFS server:
# apt-get install nfs-common nfs-kernel-server
b. Edit the file /etc/exports and add the following line:
/home IP1(rw,no_root_squash) IP2(rw,no_root_squash) ...
where IP1, IP2, ... are the IP addresses of the client machines. Then execute the following command:
# exportfs -ra
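For example, with two client machines at the hypothetical addresses 192.168.0.2 and 192.168.0.3 (substitute the addresses you found in step 1), the line would read:
/home 192.168.0.2(rw,no_root_squash) 192.168.0.3(rw,no_root_squash)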
c. Become user "mpi":
# su - mpi
d. Change the current directory to ~/.ssh:
$ cd ~/.ssh
If the directory does not exist yet, create it with:
$ mkdir ~/.ssh
$ chmod go-rwx ~/.ssh
(If it exists already, make sure it has no group or world write permissions, or this won't work.)
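You can double-check the permissions with:
$ ls -ld ~/.ssh
The listing should show drwx------ with mpi as the owner.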
e. Generate a private/public identity key pair with:
$ ssh-keygen -t dsa
Press Enter (twice) when asked for a passphrase, to leave it empty. This creates two new files, id_dsa and id_dsa.pub.
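Note: on recent systems DSA keys may be disabled. If ssh-keygen rejects the -t dsa option, generate an RSA key instead; everything else works the same, but the files are named id_rsa and id_rsa.pub, so adjust step f accordingly:
$ ssh-keygen -t rsa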
f. Copy the id_dsa.pub file under the name authorized_keys:
$ cd ~/.ssh
$ cp id_dsa.pub authorized_keys
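If the password-less login in step g below does not work, a common cause is over-permissive file modes: sshd ignores an authorized_keys file that is writable by group or others. Tightening it does no harm:
$ chmod 600 ~/.ssh/authorized_keys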
g. From each PC, try to log into all the other machines (you found their IP addresses in step 1):
$ ssh mpi@other-pc
You should be able to log in without a password prompt.
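Note: on the very first connection to each machine, ssh asks you to confirm the host's key fingerprint; answer "yes". You can also run a single remote command to test non-interactively, for example (192.168.0.2 is a hypothetical address, substitute your own):
$ ssh mpi@192.168.0.2 hostname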
The following step should be performed on all the slave nodes (the NFS clients) in the cluster:
5. Install the NFS client:
# apt-get install nfs-common
Edit /etc/fstab and add the following line:
IP:/home /home nfs rw,rsize=4096,wsize=4096,hard,intr,async,nodev,nosuid 0 0
where IP is the IP address of the server. Now reboot your client machine.
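For example, if the server's address were 192.168.0.1 (a hypothetical address; substitute your own from step 1), the line would read:
192.168.0.1:/home /home nfs rw,rsize=4096,wsize=4096,hard,intr,async,nodev,nosuid 0 0
Instead of rebooting, you can also mount everything listed in /etc/fstab immediately with:
# mount -a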
6. Place the executable you want to run in a directory on one of the machines (it does not matter which one, as /home is shared). For instance, you can copy the IMB-MPI1 executable you compiled before to the directory /home/mpi/test. To do this, open a root console and type (the first command creates the directory in case it does not exist yet):
# mkdir -p /home/mpi/test
# cp /home/myname/IMB-3.0/src/IMB-MPI1 /home/mpi/test
# chown mpi:mpi /home/mpi/test/IMB-MPI1
Then return to the user console.
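As a quick check that the NFS export works, verify from every PC (as user mpi) that the file is visible:
$ ls -l /home/mpi/test/IMB-MPI1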
7. Edit a host file in the /home/mpi directory:
$ gedit /home/mpi/myhosts
This file should list all the PCs in your cluster, one per line, for example:
IP1 slots=2
IP2 slots=2
...
IPn slots=2
where IP1 ... IPn are the IP addresses of the machines you wish to use, and slots is the number of processes each machine can host (typically its number of CPU cores). You should have identified the IP addresses of the other PCs in step 1.
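A concrete (hypothetical) example for a two-PC cluster would be:
192.168.0.1 slots=2
192.168.0.2 slots=2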
The following steps are performed on one PC only; which PC does not matter, as long as it is part of your cluster. If you have performed the above steps correctly, you will be able to run MPI programs from any PC in the cluster and exploit the combined multi-processor power of all its machines.
8. To run an MPI code (e.g., the IMB) across different nodes, type (e.g., for 2 PCs with 2 slots each):
$ mpirun --hostfile myhosts -np 4 ./IMB-MPI1
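If you first want a quick sanity check that the processes really land on different machines, a minimal MPI test program such as the following sketch will do (hello_mpi.c is a hypothetical name, not part of the IMB suite; it assumes the mpicc compiler wrapper is installed):

/* hello_mpi.c: each process prints its rank and the host it runs on */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */
    MPI_Get_processor_name(host, &len);    /* name of the node we run on */
    printf("rank %d of %d runs on host %s\n", rank, size, host);
    MPI_Finalize();
    return 0;
}

Compile and run it the same way as the benchmark:
$ mpicc hello_mpi.c -o hello_mpi
$ mpirun --hostfile myhosts -np 4 ./hello_mpi
Every machine listed in myhosts should appear in the output.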
9. Issue the top command on all the PCs to make sure the code is running everywhere in the cluster.
10. Now rerun the benchmark and save the program output to a file, as in the previous exercise. Observe the difference in performance compared to the run on a single PC. Why is there such a difference?