- 1 About this wiki
- 2 Grid Computing & Grid Engine Overview
- 3 Documentation
- 4 HOWTOs
- 5 Utilities
- 6 Grid Engine wiki and related projects
- 7 Success Story
- 8 Frequently Encountered Problems
- 9 Stephan's Blog Posts
- 10 Community Contributions
- 11 Glossary
- 12 References
About this wiki
Note that references here to old Sun web pages may no longer be valid following the Oracle takeover. In particular, gridengine.sunsource.net no longer exists. There are currently three sites with Grid Engine source repositories derived from the sunsource one, and other material:
- Univa public repository (security updates since initial release only)
- Son of Grid Engine (a large superset of Univa's repo and the only public issue tracker) and Howtos etc
- Open Grid Scheduler
The core Grid Engine websites are:
- Oracle documentation (Oracle has taken down the Grid Engine pages after announcing a hand-over of support for Oracle Grid Engine to Univa.)
This wiki server (and the related http://gridengine.info blog) has been set up by members of the Grid Engine user community as a way of collecting and distributing usage, configuration, tips and HOWTO information in a fast and democratic way. We don't want to reinvent the wheel or waste time so wherever there are better online resources available, this Wiki will simply link out to them.
The best place to get help with Grid Engine related issues is the users mailing list.
Grid Computing & Grid Engine Overview
This is not the place for an academic dissertation or even a management-friendly summary of "grid" or "cluster" computing. If you are just starting out learning about clusters you may want to check out sites like http://www.clustermonkey.net. Some good intro-level information on Grid Engine and its capabilities can be found here (these pages are no longer up, as fallout from Oracle acquiring Sun in 2010):
- Chapter 1 of the 'N1 Grid Engine Users Guide'
- Introduction to the Cluster Grid - Part 1 (August 2002) (Sun BluePrint PDF)
Grid Engine is software that facilitates "distributed resource management" (DRM) -- other similar free software packages include Portable Batch System, Torque, SLURM, OAR and Lava, but they lack scheduling features and/or scalability in comparison. Far more than just simple load-balancing tools or batch scheduling mechanisms, DRM software typically provides the following key features across large sets of distributed resources:
- Policy based allocation of distributed resources (CPU time, software licenses, etc.)
- Batch queuing & scheduling
- Support for diverse server hardware, OS and architectures
- Load balancing & remote job execution
- Detailed job accounting statistics
- Fine-grained user specifiable resources
- Suspend/resume/migrate jobs
- Tools for reporting Job/Host/Cluster status
- Job Arrays
- Integration & control of parallel jobs
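Several of the features above (batch queuing, job arrays, parallel jobs, user-specified resources) are requested through options to the `qsub` command. The following is a minimal Python sketch that only assembles such a command line without running it. The helper name `build_qsub_command`, the script name `run_sim.sh` and the job name `sim` are made up for illustration; the options used (`-N`, `-cwd`, `-t`, `-pe`, `-l`) and the `h_rt` runtime resource should be checked against the `qsub` man page of your installation.

```python
import shlex


def build_qsub_command(script, name, tasks=None, pe=None, resources=None):
    """Return the argv list for a qsub submission (nothing is executed)."""
    argv = ["qsub", "-N", name, "-cwd"]           # job name, run in current directory
    if tasks is not None:
        first, last = tasks
        argv += ["-t", f"{first}-{last}"]         # job array task range
    if pe is not None:
        pe_name, slots = pe
        argv += ["-pe", pe_name, str(slots)]      # parallel environment and slot count
    for key, value in sorted((resources or {}).items()):
        argv += ["-l", f"{key}={value}"]          # one resource request per -l option
    argv.append(script)
    return argv


# Hypothetical example: a 100-task job array with a one hour runtime limit.
cmd = build_qsub_command("run_sim.sh", "sim", tasks=(1, 100),
                         resources={"h_rt": "01:00:00"})
print(shlex.join(cmd))
```

Building the argument list separately from executing it makes it easy to inspect or log the submission before handing it to something like `subprocess.run` on a submit host.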
Is Grid Engine commercial or open source software?
Both. Since 2011 Univa Corporation has offered a continuously enhanced and commercially supported fork of the last open source Sun version, 6.2u5. This commercial version is called Univa Grid Engine. Open source forks are Son of Grid Engine and Open Grid Scheduler.
After acquiring Sun Microsystems in 2010, Oracle continued to offer commercial Grid Engine versions for sale (Oracle Grid Engine versions 6.2u6-8) until September 2013, when it handed over the Grid Engine IP and Oracle Grid Engine customer support to Univa, as announced here.
Binary packages of the latest Sun open source version may still be found somewhere on the web. It is important to distinguish them from Sun's commercial packages. The latter were also available as free downloads from Sun's download center, but had a 90-day evaluation period after which the product had to be bought or uninstalled. This evaluation period was introduced with version 6.2u3 and applies to later versions as well. The open source packages do not have this restriction. A license file in the installation directory indicates which type of package is being used.
Even older versions are free to use for any purpose, regardless of whether they are the open source build or the Sun product build. Commercial support through Sun was only available for the Sun product builds, but this is no longer relevant.
The following two subsections also relate to old versions, which may still be referenced occasionally:
Grid Engine 5.x "Standard" vs. "Enterprise" Editions
Prior to the release of Grid Engine 6 in 2004, there were two different "flavors" of Grid Engine offered by Sun Microsystems. The "standard" edition was a Sun software product made available for free to any user or institution. The "Enterprise Edition" was available from Sun at additional cost. The only difference between the two offerings was the number of scheduling policy types supported. The standard edition generally supported only a "first-in, first-out" FIFO type scheduling policy along with a simple user-sort scheduling mechanism. The Enterprise Edition supported FIFO scheduling in addition to several other more flexible policy based scheduling mechanisms. Interestingly enough, the codebase and binaries for both flavors are 100% the same. The difference between "standard" and "enterprise edition" is triggered by a simple flag passed via an installation script. During the time that Sun Microsystems was offering two different types of "Sun Grid Engine", the open source site was offering both flavors of "Grid Engine" for free.
"Sun N1 Grid Engine 6" vs. "Grid Engine 6"
When Grid Engine 6 was first released, Chris Dagdigian wrote a simple whitepaper entitled Understanding the differences between Grid Engine 5.3, 6.0 and Sun N1 Grid Engine 6 (N1GE). While somewhat dated, it provides a good overview of the differences.
Major project news and milestones
(The news items below refer to the old Sun-sponsored project and are no longer of much relevance. Some of the links may be dead.)
- Further complicating the relationship between the Sun-branded and open source versions of Grid Engine, Sun made a surprise announcement in December 2005 that the full Sun N1 software stack, including N1 Grid Engine (among many other software products), would now be available for "free". More information on this announcement can be found online: http://gridengine.info/articles/2005/12/01/sun-n1-grid-engine-is-now-free
- Another major change for Grid Engine occurred when Sun announced at the Supercomputing 2006 conference that all of their commercial product add-ons for Grid Engine (ARCo and Windows client support) would be integrated with the open source Grid Engine codebase. The full announcement can be read here: http://gridengine.sunsource.net/news/SuperComputing2006.html
- Grid Engine 6.1 was released, the first major revision release since Grid Engine 6.0 was announced in 2004. Included among numerous improvements and enhancements is the new highly-capable Resource Quota subsystem.
If you'd rather have the Grid Engine product team tell you about Grid Engine, there's a video on YouTube from the Grid Engine team that introduces grid computing and talks about what Grid Engine is and how it's commonly used.
(Sun BluePrints have been taken off the web by Oracle. Maybe some web archives still have copies of the content.)
Sun maintained a technical library of "BluePrint Documents". Notable publications include:
- Sun Grid Engine, Enterprise Edition-Configuration Use Cases and Guidelines (July 2003)
- Global Grid Connectivity Using Globus Toolkit With Solaris Operating System (May 2004)
- Using Host Groups and Cluster Queues in the Sun N1 Grid Engine 6 System (August 2005)
- Scheduler Policies for Job Prioritization in the N1 Grid Engine 6 System (October 2005)
- Sun N1 Grid Engine Software and the Tokyo Institute of Technology Super Computer Grid (June 2007)
|LAM-MPI||Tight integration of Grid Engine and LAM-MPI|
|Intel MPI||Loose and Tight Integration|
|FLEXlm License Manager||New (Olesen) method with some configuration notes. Also with a github project page|
|FLEXlm License Manager||Old "load sensor" method (not recommended)|
|FLEXlm License Manager||FLEXlm license load sensor written in Python|
|LicenseJuggler||Sharing software licenses across multiple Grid Engine sites|
|Matlab||The Grid Engine community is looking for Matlab integration methods and tips.|
|Ansys||Running Ansys applications as Grid Engine jobs.|
|Clearcase||If you are looking to use SGE to improve Clearcase build times, SGE is not your solution yet. Commercial solutions exist, such as Electric-Cloud and IBM's buildforge.|
|PEST||PEST is a general purpose, parameter estimation and optimization program that can be used with any simulation code. Notes and a skeleton script to integrate parallel PEST and Grid Engine can be found here.|
|STARCD STAR-CCM+ Integration||STARCD and STAR-CCM+ are general purpose CFD codes from CD-adapco.|
|Distributed-Compilation||Grid Engine has a tool called "qmake" which can help distribute large source code building tasks across a cluster of machines. Information on distributed builds has been moved to the Distributed-Compilation wiki page.|
|Dytran||Running Dytran applications as Grid Engine jobs.|
|Hadoop||Grid Engine can be used to run MapReduce jobs. A tighter integration is available from Univa ("Share with Hadoop" tab).|
|MD Nastran||Guidelines on useful SGE configuration for typical MD Nastran user. Also see Part 1, Part 2, Part 3 and Part 4 on running MD Nastran (serial and parallel) as Grid Engine jobs.|
|Macintosh OS X||GridEngine_launchd for notes on getting Grid Engine to function under the new launchd framework for system services in OS X 10.5 (Leopard)|
|Windows||You can start jobs on a UNIX/Linux machine from Windows using SSH and SAMBA.|
|Linux&Windows||Install and configure Grid Engine in a heterogeneous environment on Linux and Windows with MPICH2|
Integration with Compute Clouds
- Grid Engine on EC2
- Grid Engine on Joyent Accelerators
- One-Click HPC with Univa Grid Engine on Amazon AWS Marketplace
- One-Click HPC with Univa Grid Engine and RightScale
Datacenter Related Topics
Various short utilities for doing stuff with Grid Engine can be found on the Utilities page.
On the ARCo Queries page, ARCo users can contribute and share their custom ARCo queries.
Documenting Grid Engine XML output
Documenting and understanding Grid Engine 'qping' output
Grid Engine XML parser project
- Dan has started a Grid Engine XML parser project under Passau
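As a rough illustration of what consuming Grid Engine XML output can look like, the sketch below parses a small hand-written sample with Python's standard `xml.etree.ElementTree` module. The sample is only modelled on the general shape of `qstat -xml` output; the element names used here (`job_list`, `JB_job_number`, `JB_name`, `JB_owner`, `state`, `slots`), the job data and the helper name `parse_jobs` are assumptions for illustration and should be verified against the output of your own Grid Engine version.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT captured from a real cluster (assumed layout).
SAMPLE = """<job_info>
  <queue_info>
    <job_list state="running">
      <JB_job_number>101</JB_job_number>
      <JB_name>sim</JB_name>
      <JB_owner>alice</JB_owner>
      <state>r</state>
      <slots>4</slots>
    </job_list>
  </queue_info>
  <job_info>
    <job_list state="pending">
      <JB_job_number>102</JB_job_number>
      <JB_name>post</JB_name>
      <JB_owner>bob</JB_owner>
      <state>qw</state>
      <slots>1</slots>
    </job_list>
  </job_info>
</job_info>
"""


def parse_jobs(xml_text):
    """Return one dict per <job_list> element, in document order."""
    root = ET.fromstring(xml_text)
    jobs = []
    for job in root.iter("job_list"):
        jobs.append({
            "id": int(job.findtext("JB_job_number")),
            "name": job.findtext("JB_name"),
            "owner": job.findtext("JB_owner"),
            "state": job.findtext("state"),
            "slots": int(job.findtext("slots")),
        })
    return jobs


for job in parse_jobs(SAMPLE):
    print(job["id"], job["owner"], job["state"], job["slots"])
```

Iterating over every `job_list` element regardless of its parent keeps the sketch independent of whether a job appears in the running or the pending section.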
Development specifications for 6.2 are kept on GridWiki
Development specifications for 6.1 are kept on GridWiki, but there are also others which are not yet implemented.
Grid Engine Packaging Efforts
Packaging documentation and discussion has been moved to its own GE-Packaging wiki page. This is of interest to people working on alternative binary or source installation/distribution methods for Grid Engine.
A list of DRMAA success stories is kept at http://www.drmaa.org/stories.php
Frequently Encountered Problems
Stephan's Blog Posts
This is an archive of Stephan Grell's blog posts.