Thursday, September 01, 2005

Preventing Users From Logging in to Computing Nodes Without Using PBS

Normally an experienced Beowulf cluster administrator would advise people not to log in to the computing nodes directly. However, we (I am a user myself, too) tend to connect to the computing nodes and run something on them without going through the scheduler (or resource allocator, we might say). All right, it's just a small job and you don't want to run it on the server because the server is often busy. The administrator would probably be mad, because he or she can no longer keep the resources fairly accessible to all users. So the following command (only for OSCAR clusters or other clusters with PBS/Torque installed) is, I guess, what administrators should recommend their users to run:

$ qsub -I -N "interactivejob" -S /bin/tcsh -q workq -l nodes=1:ppn=1
This lets users log in to the computing nodes through the scheduler.
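
Since typing that whole line is exactly the kind of hassle that drives people back to ssh, here is a rough sketch of a wrapper for sh/bash users. The function name interactive_node and the INTERACTIVE_NODE_DRY_RUN switch are made up by me for illustration; the qsub options are simply the ones from the command above, and workq should be whatever queue your cluster really uses.

```shell
# Hypothetical convenience wrapper around the qsub line above.
# Usage: interactive_node [ppn]    (ppn defaults to 1)
interactive_node() {
    ppn="${1:-1}"
    # INTERACTIVE_NODE_DRY_RUN is an invented switch: when it is set,
    # only print the command so it can be checked without a scheduler.
    if [ -n "${INTERACTIVE_NODE_DRY_RUN:-}" ]; then
        echo qsub -I -N interactivejob -S /bin/tcsh -q workq -l "nodes=1:ppn=$ppn"
    else
        qsub -I -N interactivejob -S /bin/tcsh -q workq -l "nodes=1:ppn=$ppn"
    fi
}
```

Put in a user's ~/.bashrc or similar, "interactive_node 2" would then ask the scheduler for one node with two processors instead of making people remember the full option list.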

Do we think the users will follow the rules and give up logging in to the nodes? No, we are not stupid. A civilized but lazy way is to beg the users in /etc/motd:

Please! Please do not ssh into the node! We beg you!
Of course this doesn't work on hackers. Unfortunately, people usually think of themselves as hackers. So if we put the following local.csh script in /etc/profile.d/ on the nodes, we can stop manual logins through ssh:
if ( ! $?PBS_ENVIRONMENT ) then
   if ( $?SSH_TTY && `whoami` != "root" ) then
      echo; echo please stop logging in to the node through ssh; echo
      logout
   endif
endif
Or, as Jenna (in #oscar-cluster @ FreeNode) pointed out, use local.sh for bash/sh users:
[ -z "$PBS_ENVIRONMENT" ] && [ -n "$SSH_TTY" ] && [ "$(whoami)" != "root" ] && logout
This kind of design will not prevent users from using cexec, MPI, qsub or pbsdsh. However, it doesn't guarantee that users are absolutely unable to ssh to the nodes. If users are determined to do so, the admin should use more civilized communication skills rather than get into a technical fight.
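
To make the check easier to read and to try out, the same logic can be sketched as a small sh function. This is only an illustration under my own assumptions: the name should_block_login and the wording of the message are invented, and it relies, like the scripts above, on PBS setting PBS_ENVIRONMENT inside jobs and sshd setting SSH_TTY for terminal logins.

```shell
# Sketch of /etc/profile.d/local.sh with the test split into a function.
# should_block_login is a made-up name, not part of PBS/Torque or OSCAR.
# Arguments: $1 = value of PBS_ENVIRONMENT, $2 = value of SSH_TTY, $3 = user name
# Returns 0 (true) when this looks like a direct ssh login by a non-root user.
should_block_login() {
    [ -z "$1" ] && [ -n "$2" ] && [ "$3" != "root" ]
}

# Only kick the user out when all three conditions hold.
if should_block_login "${PBS_ENVIRONMENT:-}" "${SSH_TTY:-}" "$(id -un)"; then
    echo
    echo "please log in to the nodes through qsub, not ssh"
    echo
    logout
fi
```

Keeping the decision in a function means it can be called by hand with fake values to see what it would do, without actually getting logged out of anything.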

With this problem gone, we now face another one. People will just do their stuff on the server because they can't log in to the nodes. And qsub is such a hassle that a genius won't use it. Screw you guys, I am going home. 凸
