Install and configure Grid Engine in heterogenic environment on Linux and Windows with MPICH2

Revision as of 09:34, 17 January 2017

Author: Jacek Strzelczyk <jacek.strzelczyk@gmail.com>

Basic software

  • Linux machines: Fedora Core 3
  • Windows machines: Windows 2000 SP4
  • Services For Unix 3.5
  • GridEngine 6.1u4
  • MPICH2 1.0.7

Developing software

  • gcc version 3.4.4
  • MS Visual C++ 2005 SP1
  • Dev-Cpp

Pre-install requirements


NIS is a service that distributes information that must be known throughout the network to every machine on it. It is very helpful for keeping a consistent user structure on all nodes in the grid. Full NIS HOWTO can be found here. This installation needs one user account: name it 'sgeadmin', add it to the NIS database, and set its $HOME to “/usr/SGE”. /etc/hosts can also be distributed through NIS.


Having a common filesystem to install and run SGE is a simple and flexible solution. It can be achieved in many ways, and I'll focus on NFS. The full NFS HOWTO can be found here. The easiest way is to install the NFS server on the machine intended to be the SGE master host. The rest of the hosts will be NFS clients.

NFS on Linux

NFS server

To prepare and share the directory with SGE do:

$ mkdir /usr/SGE
$ echo "/usr/SGE  M1(rw,no_root_squash,async) M2(rw,no_root_squash,async) M3(rw,no_root_squash,async)" >> /etc/exports
# M1, M2 and M3 are the names of the client hosts (they need to be in /etc/hosts).
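With more than a few clients, the export entry is easier to generate than to type. The following is a minimal sketch; make_exports_line is a hypothetical helper name (not part of NFS or SGE), and the host names are the placeholders used above.

```shell
#!/bin/sh
# Build one /etc/exports entry for a directory shared with several clients.
# make_exports_line is a hypothetical helper, not part of NFS or SGE.
make_exports_line() {
    dir=$1
    shift
    line=$dir
    for host in "$@"; do
        # Same options as in the hand-written entry.
        line="$line ${host}(rw,no_root_squash,async)"
    done
    printf '%s\n' "$line"
}

# Print the entry for the three example clients.
make_exports_line /usr/SGE M1 M2 M3
```

Run as root, the output can be appended with make_exports_line /usr/SGE M1 M2 M3 >> /etc/exports.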

Restart NFS.

NFS client

$ mkdir /usr/SGE
$ chown sgeadmin /usr/SGE
$ mount -t nfs masternode:/usr/SGE /usr/SGE  #should be added to fstab with suid option

NFS on Windows

To mount the network drive in Windows, log in as Administrator and type:

>net use X: \\masternode\usr\SGE

To do this automatically at each system boot, use AutoExNT (http://support.microsoft.com/kb/243486):

a) Using a text editor (such as Notepad), create a batch file named Autoexnt.bat and include the commands you want to run at startup in this file – that would be

@net use X: \\masternode\usr\SGE

b) Copy the Autoexnt.bat file you just created, in addition to the Autoexnt.exe, Servmess.dll, and Instexnt.exe files located in the Resource Kit CD-ROM (or here: http://www.dynawell.com/reskit/microsoft/win2000/autoexnt.zip) to the C:\WINNT\System32 folder on your computer.

c) At a command prompt, type instexnt install, and then press ENTER.

You should then receive the following message:

CreateService AutoExNT SUCCESS with InterActive Flag turned OFF 

This creates the AutoExNT service in Windows, which automatically mounts /usr/SGE as drive X: at boot time. To make sure this happens after all network connections are up, add a dependency in the Windows registry. Open the registry editor regedt32 (not regedit!), go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AutoExNT and add a string value named “DependOnService” with the value “LanmanWorkstation”.

SGE installation

SGE Linux master host

First, create /usr/SGE/.rhosts file containing all hosts in your SGE installation. Chmod it to 600. Check if rsh works on Linux machines by executing as sgeadmin:

$ rsh otherlinuxhost date
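The .rhosts step and the rsh check can be scripted so they are easy to repeat when hosts are added. This is only a sketch: write_rhosts and check_rsh are hypothetical helper names, and the host names passed to them are placeholders.

```shell
#!/bin/sh
# write_rhosts is a hypothetical helper: it writes one host per line to
# <home>/.rhosts and restricts the file to its owner (mode 600).
write_rhosts() {
    home=$1
    shift
    : > "$home/.rhosts"
    for host in "$@"; do
        printf '%s\n' "$host" >> "$home/.rhosts"
    done
    chmod 600 "$home/.rhosts"
}

# check_rsh is a hypothetical helper: it runs "date" on each host through
# rsh and reports the hosts that could not be reached.
check_rsh() {
    for host in "$@"; do
        rsh "$host" date || echo "rsh to $host failed" >&2
    done
}
```

As sgeadmin, something like write_rhosts /usr/SGE masternode M1 M2 M3 followed by check_rsh M1 M2 M3 would cover the step above.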

Then, add three lines to /etc/services:

sge_execd	535/tcp
sge_commd	536/tcp
sge_qmaster	537/tcp
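Because every host in the grid needs the same three entries, it can help to add them with a small function that does nothing when an entry is already present. The sketch below assumes the port numbers listed above; add_service is a hypothetical helper name.

```shell
#!/bin/sh
# add_service is a hypothetical helper: append "name<TAB>port/tcp" to a
# services file unless a line for that service name already exists.
add_service() {
    file=$1
    name=$2
    port=$3
    if ! grep -q "^${name}[[:space:]]" "$file"; then
        printf '%s\t%s/tcp\n' "$name" "$port" >> "$file"
    fi
}
```

As root, the three lines would be added with add_service /etc/services sge_execd 535, add_service /etc/services sge_commd 536 and add_service /etc/services sge_qmaster 537. Running the same calls again leaves the file unchanged.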

Download SGE:

Common Files: http://gridengine.sunsource.net/download/SGE61/sge-6.1-common.tar.gz

Linux files: http://gridengine.sunsource.net/download/SGE61/sge-6.1-bin-lx24-x86.tar.gz

Unpack them:

# su - sgeadmin
$ mv sge-6.1-common.tar.gz /usr/SGE/
$ mv sge-6.1-bin-lx24-x86.tar.gz /usr/SGE/
$ cd /usr/SGE/
$ for f in sge-6.1-*.tar.gz; do tar -xzvf "$f"; done   # tar reads one archive per call

Before starting installation procedure, file util/arch needs to be edited. Change line 248 to:


and then:

$su -

The full installation procedure is described in the SGE documentation: http://docs.sun.com/app/docs/doc/817-6118/emrar?q=N1GE&a=view.

SGE Linux exec hosts

Described in the SGE documentation: http://docs.sun.com/app/docs/doc/817-6118/emrar?q=N1GE&a=view.

SGE Windows exec hosts

In short, step by step:

  • Create user 'sgeadmin' locally.
  • Download Services For Unix (SFU) from here.
  • Turn off DEP by adding “/noexecute=alwaysoff” to C:\boot.ini under [boot loader] section.
  • Run SFU installation procedure and add Interix SDK and Interix GNU SDK to default installation.
  • Check if User Mapping daemon is working after installation is complete
  • Go to Menu Start -> Programs -> Windows Services for Unix -> Configuration -> User Name Mapping, choose NIS and Show User Maps. Then connect Unix user sgeadmin with Windows user of the same name.
  • Mount X: drive as /usr/SGE in Interix:
%ls -l /dev/fs    # should show also X
%ln -s /dev/fs/X /usr/SGE
  • Run telnet and rsh from Interix: log in to Windows as Administrator, permanently disable the Windows telnet and rsh daemons, uncomment the rsh and telnet lines in /etc/inetd.conf in Interix, and restart inetd:
%ps -ef | grep inetd
%kill -1 <PIDofINETD>
  • Check Windows firewall and open ports 23 (telnet) and 514 (shell). Use nmap to check if everything is ok.
  • Add all grid machines to /etc/hosts in Interix
  • Add line “ ftp.interopsystems.com” to C:\WINNT\system32\drivers\etc\hosts so that ftp can reach this portal (otherwise there are problems in address translation)
  • Install bash:
%pkg_update -L bash
  • Create $HOME/.rhosts file in Interix containing all hosts in your SGE installation
  • Download windows specific SGE files (sge61u4_addarchs_targz.zip) from Sun's web page. Available after registration.
  • Unpack and copy them to proper directories
  • Run the installation:


MPICH2 on Linux

  • Download mpich2-1.0.7.tar.gz archive
  • Configure and install:
$ mkdir /usr/SGE/mpich2
$ ./configure --prefix=/usr/SGE/mpich2 --with-pm=smpd --with-pmi=smpd
$ make
$ make install
$ cd $HOME
$ echo "phrase=behappy" > .smpd
  • Add /usr/SGE/mpich2/bin to $PATH
  • Check whether the smpd daemon is running; if it is not, start it with smpd -s
  • To compile MPI programs:
$ gcc mpi-test.c -o mpi-test -I/usr/SGE/mpich2/include -L/usr/SGE/mpich2/lib -lmpich
  • Create credentials file:
$ printf 'sgeadmin\nsgeadmin\n' > /usr/SGE/credentials
$ chmod 600 /usr/SGE/credentials
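The PATH step in the list above is usually placed in a shell startup file, where it may run more than once per session. A minimal sketch that adds the directory only when it is missing follows; add_to_path is a hypothetical helper name, and the choice of startup file is left to the local setup.

```shell
#!/bin/sh
# add_to_path is a hypothetical helper: append a directory to PATH only if
# it is not already one of its colon-separated entries.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path /usr/SGE/mpich2/bin
export PATH
```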

MPICH2 on Windows

  • Download and install Visual C++ 2005 SP1
  • Download and install MPICH2 for Windows
  • Check if MPICH2 Process Manager daemon is working
  • To compile programs in Windows install Dev-Cpp (or other programming environment)
  • Compile source code in Dev-Cpp with MPICH2 libraries and headers: -I"C:\Program Files\MPICH2\include" -L"C:\Program Files\MPICH2\lib" -lmpi
  • Copy the compiled program into C:\WINNT\system32 (or another directory on the Windows PATH)

Add MPICH2 as parallel environment to SGE

Use qmon from SGE to manually add MPICH2 as parallel environment to SGE. Description of PE can be found here: http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html

Configure Interix

A little configuration is needed so that Interix starts only after the network drive with SGE (drive X:) is mounted on the Windows machine. SGE can then be started automatically by an Interix startup script. To do that, add a dependency to the Windows Registry:

  • Open regedt32
  • Go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Interix and add a string value named “DependOnService” with the value “AutoExNT”.
  • Copy and adjust one of the Interix startup scripts from /etc/init.d to start SGE.
  • Add symbolic links to the SGE start script in /etc/rc2.d
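An adjusted startup script could look roughly like the sketch below. It rests on assumptions that should be checked against the local installation: that SGE_ROOT is /usr/SGE, that the cell is named "default", and that the installer created the start script $SGE_ROOT/default/common/sgeexecd. The function name sge_init is hypothetical.

```shell
#!/bin/sh
# Sketch of an init-style wrapper for the SGE execution daemon.
# Assumptions: SGE_ROOT is /usr/SGE, the cell is named "default", and the
# installation created the start script $SGE_ROOT/default/common/sgeexecd.
SGE_ROOT=${SGE_ROOT:-/usr/SGE}
SGE_STARTER=${SGE_STARTER:-$SGE_ROOT/default/common/sgeexecd}

sge_init() {
    case "$1" in
        start|stop)
            # Fails cleanly if the network drive is not mounted yet.
            if [ -x "$SGE_STARTER" ]; then
                "$SGE_STARTER" "$1"
            else
                echo "SGE start script not found: $SGE_STARTER" >&2
                return 1
            fi
            ;;
        *)
            echo "usage: sge {start|stop}" >&2
            return 2
            ;;
    esac
}

# When called by init with an argument, dispatch it.
if [ $# -gt 0 ]; then
    sge_init "$1"
fi
```

The existence check matters here because the script lives on the local disk while SGE itself lives on drive X:; if the mount is late, the script reports the problem instead of failing silently.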

Then, after the Windows machine restarts, all network connections should be up, network drive X: (with SGE) should be mounted, and the Interix startup script should start the SGE exec daemon. All of this happens automatically, without any user logging in.

Post-install check

Ok, so now you should have:

  • Linux master host with: NIS and NFS servers, SGE master and SGE exec daemons running. Check by ps aux | grep sge – there should be three processes: sge_qmaster, sge_commd and sge_execd. Also MPICH2 daemon – smpd.
  • Linux execution hosts with: mounted /usr/SGE from master host, SGE exec and smpd daemons running.
  • Windows execution hosts with: mounted /usr/SGE as network drive X:, smpd daemon running (from Windows version of MPICH2) and Interix with SGE exec daemon.


To test the installation, create a simple MPI program (or use an example from mpich2/examples) and compile it on both platforms:

  • on Linux: gcc mpi-test.c -o mpi-test -I/usr/SGE/mpich2/include -L/usr/SGE/mpich2/lib -lmpich
  • on Windows: use Dev-Cpp with arguments -I"C:\Program Files\MPICH2\include" -L"C:\Program Files\MPICH2\lib" -lmpi

Then copy the output binaries to a directory in $PATH on both Windows (e.g. C:\WINNT\system32) and Linux (e.g. /usr/bin/). Create a script (examples in /usr/SGE/examples) that executes mpirun:

mpirun -n $NSLOTS -machinefile $TMPDIR/machines -pwdfile /usr/SGE/credentials mpi-test
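For illustration, the mpirun line can be wrapped in a job script like the one generated below. The parallel environment name "mpich2" and the slot count of 4 are assumptions; use whatever name was given to the PE in qmon. The file name mpi-test.sh is also just an example.

```shell
#!/bin/sh
# Write a minimal SGE job script around the mpirun line.
# Assumptions: the parallel environment was registered as "mpich2" and
# four slots are requested; adjust both to the local configuration.
cat > mpi-test.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -N mpi-test
#$ -pe mpich2 4
mpirun -n $NSLOTS -machinefile $TMPDIR/machines -pwdfile /usr/SGE/credentials mpi-test
EOF
chmod +x mpi-test.sh
```

The script would then be submitted with qsub mpi-test.sh and watched with qstat. The quoted 'EOF' keeps $NSLOTS and $TMPDIR unexpanded in the file, so they are filled in when the job runs.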

It should give you proper results (I hope...)!