STARCD STAR-CCM+ Integration

From GridWiki

STARCD

These notes apply to STAR-CD v4.x only.

STAR-CD integrates with GridEngine without significant issues, but some adjustments are advisable to ensure smooth operation. The main difficulty is posed by the rsh/ssh and rcp/scp transports used in STAR-CD. On many newer systems, unsafe services such as telnet, rsh and rcp are disabled by default. Even if you decide to simply activate rsh/rcp on your systems (and possibly violate the corporate security policy), there are good technical reasons not to do so. A plain rsh service must be activated on every cluster node, which leads to problems whenever this step is forgotten on a new installation. Beyond this, a plain rsh leaves processes behind on the nodes when the job is removed with qdel (more details in http://gridengine.sunsource.net/howto/mpich-integration.html).

For correct behaviour, the GridEngine rsh wrapper must be used. This wrapper, normally found under $SGE_ROOT/mpi/rsh, simply wraps rsh to use the GridEngine qrsh with the -inherit option. The backend transport that qrsh actually uses can be the GridEngine builtin (new with GridEngine 6.2), the GridEngine version of rsh that handles ports, or the standard ssh. The default GridEngine builtin version works without any known difficulties.
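As a rough illustration of what such a wrapper does, the following sketch translates an rsh-style command line into a qrsh -inherit call. It is not the script shipped under $SGE_ROOT/mpi/rsh. The function name sge_rsh and the QRSH_CMD override (used only for dry runs) are assumptions made for this example.

```sh
#!/bin/sh
# Minimal sketch of an rsh-to-qrsh translation (NOT the shipped wrapper).
# QRSH_CMD is a hypothetical override so the function can be dry-run.
sge_rsh() {
    # Skip leading rsh options that have no meaning for qrsh
    while [ $# -gt 0 ]; do
        case $1 in
            -n) shift ;;
            -l) shift; [ $# -gt 0 ] && shift ;;
            *)  break ;;
        esac
    done
    if [ $# -eq 0 ]; then
        echo "usage: sge_rsh [-n] [-l user] host command..." >&2
        return 2
    fi
    host=$1; shift
    # Hand the remote command to GridEngine so it stays under job control
    ${QRSH_CMD:-qrsh} -inherit "$host" "$@"
}
```

Setting QRSH_CMD=echo prints the qrsh arguments that would be used instead of contacting a node.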

Parallel Environment

Here is an example of a parallel environment for a tight integration:

 pe_name            mpich
 slots              999
 user_lists         NONE
 xuser_lists        NONE
 start_proc_args    /opt/grid/mpi/startmpi.sh -catch_rsh $pe_hostfile
 stop_proc_args     /opt/grid/mpi/stopmpi.sh
 allocation_rule    $fill_up
 control_slaves     TRUE
 job_is_first_task  FALSE
 urgency_slots      min
 accounting_summary FALSE

The start_proc_args contains the -catch_rsh option, which links the $SGE_ROOT/mpi/rsh wrapper into $TMPDIR. Since this directory is automatically added to the path of GridEngine jobs, the wrapper will generally be seen before any other rsh in the path. However, it seems more prudent to specify the wrapper explicitly than to rely on the path order for correct behaviour.
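One way to see which rsh a job would pick up is to test it at the top of the job script. The helper below is a sketch only; the name rsh_is_wrapped is hypothetical, and it assumes that the link created by -catch_rsh is named $TMPDIR/rsh, as described above.

```sh
# Sketch: succeed only if the first rsh on PATH is the $TMPDIR/rsh link
# created by startmpi.sh -catch_rsh.
rsh_is_wrapped() {
    [ -n "$TMPDIR" ] || return 1
    [ "$(command -v rsh)" = "$TMPDIR/rsh" ]
}

# Possible use inside a job script:
#   rsh_is_wrapped || echo "warning: rsh does not resolve to the wrapper" >&2
```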


GridEngine Configuration

To reap the benefit of the tight integration, GridEngine should be configured to kill shepherded processes via the additional group ID assigned to each job:

 execd_params                 ENABLE_ADDGRP_KILL=true
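Whether the parameter is present can be checked from the output of qconf -sconf. The helper below is a sketch with a hypothetical name (has_addgrp_kill); it reads configuration text on standard input and only tests for the presence of the keyword, not for its value.

```sh
# Sketch: succeed if the execd_params line mentions ENABLE_ADDGRP_KILL.
# Reads configuration text (such as the output of qconf -sconf) on stdin.
has_addgrp_kill() {
    grep -E '^execd_params[[:space:]]' | grep -q 'ENABLE_ADDGRP_KILL'
}

# Possible use on a host with GridEngine installed:
#   qconf -sconf | has_addgrp_kill || echo "ENABLE_ADDGRP_KILL is not set" >&2
```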

STARCD rsh wrapper

As described above, the GridEngine qrsh must be used, which is addressed by using an rsh wrapper. Unfortunately, STARCD also contains its own rsh wrapper ($STARDIR/sbin/rsh) to handle switching between an rsh and an ssh transport, as well as an rcp wrapper ($STARDIR/sbin/rcp). It also relies upon modifying the path to include $STARDIR/sbin. Correct behaviour of the $STARDIR/sbin/{rcp,rsh} scripts depends upon the following environment variables:

 REMOTECOPY
 REMOTETASK

When set, they are used to specify secure alternatives to rcp and rsh.

Where's the problem?

Based on the description thus far, there don't seem to be any potential problems:

  • The STARCD rsh wrapper ($STARDIR/sbin/rsh) is seen first in the path (it is placed there within the star script).
  • The STARCD rsh wrapper strips the $STARDIR/sbin out of the path before calling the real rsh.
  • The real rsh is in fact the next one found in the path, which should be the $TMPDIR/rsh link to $SGE_ROOT/mpi/rsh that was placed there by the $SGE_ROOT/mpi/startmpi.sh starter with the -catch_rsh option.
  • The $TMPDIR/rsh (link to $SGE_ROOT/mpi/rsh) will call the GridEngine qrsh, which in turn uses rsh, ssh or builtin for the transport.
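The path-stripping step in this chain can be sketched as follows. How the STARCD wrapper actually implements it is not shown here; strip_from_path is a hypothetical helper that merely illustrates the idea of removing one directory from a PATH-like string so that the next rsh in the path is found.

```sh
# Sketch: remove every exact occurrence of directory $1 from the
# colon-separated list $2 and print the result.
strip_from_path() {
    printf '%s\n' "$2" | tr ':' '\n' | grep -vx -F -- "$1" | paste -sd: -
}

# Possible use: find the rsh that follows the STARCD wrapper in the path
#   PATH=$(strip_from_path "$STARDIR/sbin" "$PATH") command -v rsh
```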


But what happens when there are no rsh/rcp services on the cluster? In that case, the secure equivalents must be used:

 # provide secure access
 REMOTECOPY=/usr/bin/scp; export REMOTECOPY
 REMOTETASK=/usr/bin/ssh; export REMOTETASK

Now consider what occurs:

  • The STARCD rsh wrapper resolves rsh to /usr/bin/ssh, which is then used.
  • The GridEngine qrsh mechanism will be bypassed.
  • Using qdel to kill jobs results in zombie processes!
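The bypass can be made concrete with a stand-in for the decision taken inside $STARDIR/sbin/rsh. The function below is hypothetical (pick_remote_shell is not part of STARCD) and assumes only what is described above: REMOTETASK, when set, replaces rsh.

```sh
# Hypothetical stand-in for the choice made by the STARCD rsh wrapper.
pick_remote_shell() {
    if [ -n "$REMOTETASK" ]; then
        # An absolute path such as /usr/bin/ssh is used directly,
        # so the $TMPDIR/rsh link (and therefore qrsh) is never reached.
        echo "$REMOTETASK"
    else
        # Plain "rsh" is looked up in the path, which inside a job
        # leads to $TMPDIR/rsh and on to qrsh.
        echo rsh
    fi
}
```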

Thus for correct GridEngine control, we seem to require that REMOTETASK be unset:

 unset REMOTETASK

However, for the copying of files to work, we require:

 REMOTECOPY=/usr/bin/scp; export REMOTECOPY

This is not only counterintuitive, but also means that testing a parallel job without the GridEngine will fail, since rsh will be used and this service is disabled on the system.

Summary

If we rely upon the standard STARCD mechanisms, the choice of REMOTETASK results either in a configuration that works well with GridEngine but fails when used directly, or in a configuration that works well when used directly but leaves behind zombie processes when used with GridEngine.


Required Changes

For a configuration that works without the issues described above, the following changes are required:

Within the job script, add these lines before the star command is called:

 # provide secure access
 REMOTECOPY=/usr/bin/scp; export REMOTECOPY
 REMOTETASK=/usr/bin/ssh; export REMOTETASK
 # use GridEngine rsh wrapper to call GridEngine qrsh for the mpi transport
 # hp-mpi
 MPI_REMSH=$SGE_ROOT/mpi/rsh; export MPI_REMSH
 # mpich
 P4_RSHCOMMAND=$SGE_ROOT/mpi/rsh; export P4_RSHCOMMAND


To ensure that the values are used reliably, the following changes should be made to the $STAR/bin/star script. As always, make a backup copy first. The backup will also be useful for determining what changes might be needed in future STARCD versions (unfortunately cd-adapco will not integrate the following suggested changes due to "unforeseeable repercussions" of introducing the RCP shell variable in the script).

Near the top of the $STAR/bin/star script, explicitly use the values of REMOTETASK, REMOTECOPY if they are set. This eliminates reliance on the order of the path. A typical diff:

--- star.orig   2009-02-26 22:14:32.000000000 +0100
+++ star        2009-04-22 14:43:38.197626000 +0200
@@ -23,13 +23,24 @@
 PNP_BUILDTIME="[2009-02-26-21:19:25]"

 #
-# Setups remote shell
+# Setup remote shell
 #
+# <rcp>
+RCP=rcp
+# </rcp>
 case `uname` in
 HP-UX) RSH=remsh;;
 *)     RSH=rsh;;
 esac
+
+# possibly use securer mode, even when the $STARDIR/sbin scripts are missing
+# <rcp>
+[ -n "$REMOTECOPY" ] && RCP=$REMOTECOPY
+[ -n "$REMOTETASK" ] && RSH=$REMOTETASK
+# </rcp>

Analogous to the existing RSH shell variable, an extra RCP shell variable has been introduced. The next step requires a small amount of patience, but is simple: replace the remaining occurrences of rcp with the $RCP variable. For example,

--- star.orig   2009-02-26 22:14:32.000000000 +0100
+++ star        2009-04-22 14:43:38.197626000 +0200
@@ -1773,7 +1791,9 @@
 prepare_mmboot() {
   if [ "$PNP_MMNONFS" ]; then
     $RSH $PNP_MMHOST mkdir $PNP_MMDIR > /dev/null 2>&1
-    rcp $PNP_MMCOPY $PNP_MMHOST:$PNP_MMDIR > /dev/null 2>&1
+# <rcp>
+    $RCP $PNP_MMCOPY $PNP_MMHOST:$PNP_MMDIR > /dev/null 2>&1
+# </rcp>
   fi
 }
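
To locate the occurrences that still need replacing, a whole-word search is usually enough. The helper below is a sketch with a hypothetical name (find_bare_rcp). It lists lines containing the lowercase word rcp and skips comment lines; the RCP=rcp definition itself will also be listed and can be ignored.

```sh
# Sketch: list lines of file $1 that still contain the bare word "rcp".
# Matching is case-sensitive, so uses of $RCP are not reported.
# Lines that start with a comment marker (such as "# <rcp>") are skipped.
find_bare_rcp() {
    grep -nw 'rcp' "$1" | grep -v '^[0-9]*:[[:space:]]*#'
}

# Possible use:
#   find_bare_rcp "$STAR/bin/star"
```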


These changes provide a reliable GridEngine integration for STARCD. If STARCD is also to be used (in parallel) outside of GridEngine control, REMOTETASK and REMOTECOPY will still need to be set in your environment for correct behaviour, but the $STARDIR/sbin/{rcp,rsh} wrappers are no longer required.


General STARCD Issues

Even after spending the effort to ensure that the STARCD processes stay under GridEngine control, there is a remaining problem: the star tracker daemon. The purpose of the star tracker daemon is to handle various hardware and process failures. One tracker daemon (we'll call it daemon1) runs on the same machine as the master MPI process. A second tracker daemon (we'll call it daemon2) runs on a machine of one of the slave processes. Both daemons consist, in fact, of a while loop with a sleep (usually 64 sec.) and run as a backgrounded shell process.

Since daemon1 is started from within the main star script, it will have the same Process Group Id and Session Id as the main star process and should terminate along with the sge_shepherd. The purpose of daemon2 is to track whether the machine with the master process is still available. Since this daemon is started by an rsh to one of the machines, it will not be killed off when the sge_shepherd exits, but relies on some other scripted heuristics.

In many circumstances, the tracker daemon continues to exist (days, months, years) even after its parent calculation process has ceased to exist. These remnant tracker daemons are in principle harmless: they only sleep 64 seconds, ping another host and perhaps rsh to it to get its process table. Nonetheless, when several hundred remnant tracker daemons exist, they start to impinge on the system resources.

The only apparent solution at the moment is a starTrackerTracker cron job on each calculation node that does the following:

  • list the processes with ps -eo pid,uid,etime,cmd
  • processes with /STAR/ in the command are assumed to be from the STAR directory.
  • from these processes, ones with '/bin/star -trackd' are the tracker daemon processes.
  • the other processes are considered STARCD calculation processes.
  • rogue tracker processes are those with an elapsed time that does not correspond to any STARCD calculation process (± 1hr).
  • the pids of the rogue tracker processes are sent a TERM signal.
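
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the original cron job: the function name rogue_trackers is hypothetical, the uid column is read but not used for matching, and the etime format is assumed to be [[dd-]hh:]mm:ss as printed by ps.

```sh
#!/bin/sh
# Sketch: read "ps -eo pid,uid,etime,cmd" output on stdin and print the
# pids of tracker daemons whose elapsed time is not within one hour of
# any STARCD calculation process.
rogue_trackers() {
    awk '
    # Convert [[dd-]hh:]mm:ss to seconds
    function secs(t,   d, n, p, s, i) {
        d = 0
        if (index(t, "-")) { split(t, p, "-"); d = p[1]; t = p[2] }
        n = split(t, p, ":")
        s = 0
        for (i = 1; i <= n; i++) s = s * 60 + p[i]
        return d * 86400 + s
    }
    $1 !~ /^[0-9]+$/ { next }       # skip the header line
    # "[/]STAR/" matches /STAR/ but not this awk command line itself
    $0 !~ "[/]STAR/" { next }
    {
        t = secs($3)
        if ($0 ~ "/bin/star -trackd") tracker[$1] = t
        else calc[++nc] = t
    }
    END {
        for (pid in tracker) {
            ok = 0
            for (i = 1; i <= nc; i++) {
                diff = tracker[pid] - calc[i]
                if (diff < 0) diff = -diff
                if (diff <= 3600) { ok = 1; break }
            }
            if (!ok) print pid
        }
    }'
}

# Possible use from cron (sends TERM to each rogue tracker):
#   for pid in $(ps -eo pid,uid,etime,cmd | rogue_trackers); do
#       kill -TERM "$pid"
#   done
```

Reading the process list from standard input keeps the selection logic separate from the kill step, so the list of candidate pids can be inspected before any signal is sent.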


STAR-CCM+

No integration notes yet.
