Direct Links
For compute nodes that have more than one GbE port, it can be a good idea to connect pairs of nodes directly and use the extra bandwidth for parallel computation. What I did was assign IP addresses to the direct-link ports, for instance 192.168.0.1 for odd-numbered nodes and 192.168.0.2 for even-numbered nodes.
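For reference, something along these lines brings the address up on an odd-numbered node. This is only a minimal sketch: it assumes the direct link is wired to eth1 (the interface the MPICH script below queries), and a plain ifconfig does not survive a reboot, so make it persistent in your distribution's network scripts if you need that.

# Bring up the direct-link port on an odd-numbered node; the even-numbered
# peer gets 192.168.0.2. eth1 is an assumption, adjust to the actual port.
/sbin/ifconfig eth1 192.168.0.1 netmask 255.255.255.0 up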
PBS/Torque
To reach these resources through OpenPBS/Torque, we need a customized queue for each pair of nodes, since the direct link is strictly peer to peer. What I did was feed the following commands to qmgr:
# I copied the resources settings from the workq of OSCAR 4,
# it might be different from the default workq of OSCAR 5
create queue subpair01
set queue subpair01 queue_type = Execution
set queue subpair01 resources_max.cput = 10000:00:00
set queue subpair01 resources_max.ncpus = 8
set queue subpair01 resources_max.nodect = 2
set queue subpair01 resources_max.walltime = 10000:00:00
set queue subpair01 resources_min.cput = 00:00:01
set queue subpair01 resources_min.ncpus = 1
set queue subpair01 resources_min.nodect = 1
set queue subpair01 resources_min.walltime = 00:00:01
set queue subpair01 resources_default.cput = 10000:00:00
set queue subpair01 resources_default.ncpus = 1
set queue subpair01 resources_default.nodect = 1
set queue subpair01 resources_default.walltime = 10000:00:00
set queue subpair01 resources_available.nodect = 2
set queue subpair01 enabled = True
set queue subpair01 started = True
set node node1.local,node2.local properties+=subpair01

Actually, you can save the commands into a file and load them in one shot:

qmgr < ./commands
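To double-check that the queue came out the way you meant, you can print it back out. These are standard Torque commands, nothing specific to this setup:

# Print the queue definition and its current state.
qmgr -c "print queue subpair01"
qstat -Q subpair01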
By the way, before running parallel jobs over the direct link, make sure ssh will not choke on host key verification for the new addresses. The c3 tools make it easy to accept the host keys once on every node (for example:
cexec :1-2 ssh 192.168.0.1 uptime; cexec :1-2 ssh 192.168.0.2 uptime
). (Actually, there are plenty of potential ssh problems; I believe most of the host key issues are already smoothed over by the OSCAR installation.)
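If those uptime checks still stop to ask about unknown hosts, one way past it, shown here only as a sketch and not something OSCAR configures for you, is to let ssh accept and record the new keys automatically on that first pass:

# Same check as above, but ssh accepts unknown host keys without prompting
# (only sensible here because 192.168.0.x is a private point-to-point link).
cexec :1-2 ssh -o StrictHostKeyChecking=no 192.168.0.1 uptime
cexec :1-2 ssh -o StrictHostKeyChecking=no 192.168.0.2 uptime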
MPICH
Here I use a PBS script to submit my MPICH jobs; this example runs the AMBER jac benchmark. Note how MPI_HOST is set to tell MPICH which interface to route the message traffic through.
#!/bin/sh #PBS -N "MPICHjob" #PBS -q subpair01 #PBS -l nodes=2:subpair01:ppn=8 #PBS -S /bin/sh #PBS -r n cd /home/demo/MPICH_SUBPAIR # customized machinefile cat > machine.subpairN << EOF 192.168.0.1:4 192.168.0.2:4 EOF # Tell mpich to run through the direct link export MPI_HOST=`/sbin/ifconfig eth1 | grep "inet addr:" \ | sed -e 's/inet addr://' | awk '{print $1}'` # Recommended by Dave Case in the Amber mail list export P4_SOCKBUFSIZE=524288 # Run source /opt/intel/fc/9.0/bin/ifortvars.sh /home/software/mpich_net/bin/mpirun -machinefile ./machine.subpairN -np 8 \ /home/software/amber9/exe/pmemd.MPICH_NET -O -i mdin.amber9 -c \ inpcrd.equil -p prmtop -o /tmp/output.txt -x /dev/null -r \ /dev/null # Data Retreival mv /tmp/output.txt output.pmemd9.MPICH_SUBPAIR
LAM/MPI
Here is the equivalent script for LAM/MPI; again I still have to spell out how the traffic should be routed. Note also that the first node defined by lamboot (n0) may not be the same node PBS sends the job to, which is why the results are copied back over the direct link at the end.
#!/bin/sh #PBS -N "LAMMPIjob" #PBS -q subpair01 #PBS -l nodes=2:subpair01:ppn=8 #PBS -S /bin/sh #PBS -r n cd /home/demo/LAM # customized machinefile cat > machine.subpairN << EOF 192.168.0.1 cpu=4 192.168.0.2 cpu=4 EOF # if we don't specify -ssi boot rsh, lam will use boot tm and # the IPs provided by pbs that uses the oscar lan. /opt/lam/bin/lamboot -ssi boot rsh -ssi rsh_agent "ssh" -v machine.subpairN # Run source /opt/intel/fc/9.0/bin/ifortvars.sh /opt/lam/bin/mpirun -ssi rpi sysv -np 8 \ ./sander9.LAM -O -i mdin -c inpcrd.equil -p prmtop \ -o /tmp/output.txt -x /tmp/trajectory.crd -r /tmp/restart.rst /opt/lam/bin/lamhalt >& /dev/null # Data Retreival # becuase the master node is n0, not the first node of pbs ssh 192.168.0.1 mv /tmp/output.txt /tmp/trajectory.crd /tmp/restart.rst /home/demo/LAM