GridEngine launchd

From GridWiki

Background

Testing of seed versions (as well as released versions up through 10.5.4) of Apple's Mac OS X Server "Leopard" release (http://www.apple.com/server/macosx/leopard/) shows that Apple is making a concerted effort to deprecate the /etc/rc and SystemStarter framework in favor of a unified system under launchd.

This may become a problem in the future, as current versions of Grid Engine create SystemStarter scripts when SGE is installed on Apple OS X systems.

Even more unfortunate are the weaknesses in the launchd framework, in particular in handling service dependencies and controlling the order in which system services are started. Both appear to be difficult to do within launchd as it currently exists.

Mac OS X 10.5 to 10.5.3

Generally speaking, Grid Engine runs well on these releases of Mac OS X "Leopard" Client and Server. The SystemStarter scripts installed by Grid Engine will reliably start SGE at boot time but may occasionally fail to stop the daemons correctly when "sudo SystemStarter stop SGE" is run. The Grid Engine 'sgemaster' and 'sgeexecd' start/stop scripts located in $SGE_ROOT/$SGE_CELL/common/ work quite well.

Mac OS X 10.5.4 (and later)

Note: Non-Server versions of OS X 10.5.4 do not seem to exhibit the problems described below. This may be a Mac OS X Server specific issue.

Starting with the release of OS X Server 10.5.4, users have experienced serious issues with Grid Engine installation and usage, particularly on server versions of the operating system.

The key symptoms are the following error messages:

 "can't get password entry for user "<username>". Either the user does not exist or NIS error!"

Or, if Grid Engine is installed with a non-root admin user, the system reports the following error and then puts all queues into error state 'E':

 "admin_user "<username>" does not exist"

These errors may appear consistently or intermittently, and they do not correlate with the use of OpenDirectory (problems appear with local user accounts as well), NFS, case-insensitive filesystems or any other system settings.
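Both messages indicate that a lookup of the user's password entry failed. One way to see whether such lookups fail intermittently on a given host is to repeat them from a small script. The following is only a rough diagnostic sketch: the function name and parameters are made up for illustration, and it is an assumption that a lookup through Python's standard pwd module would fail under the same conditions as the lookup performed by the Grid Engine daemons.

```python
import pwd
import time


def count_lookup_failures(username, attempts=5, delay=0.0):
    """Look up `username` in the system user database `attempts` times.

    Returns how many of the lookups failed. A result strictly between
    0 and `attempts` would point to an intermittent lookup problem.
    """
    failures = 0
    for _ in range(attempts):
        try:
            pwd.getpwnam(username)  # raises KeyError if no entry is found
        except KeyError:
            failures += 1
        time.sleep(delay)
    return failures


# Example: check the root account three times.
# Replace "root" with the SGE admin user on the affected host.
print(count_lookup_failures("root", attempts=3))
```

A result of 0 for an account that SGE reports as missing would suggest that the problem depends on the context in which the daemons are started rather than on the account itself, which would be consistent with the workaround described below.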

A definitive root cause for the problem has yet to be identified but a workaround has been discovered:

  1. Abandon use of the SystemStarter framework and scripts (/Library/StartupItems/SGE) installed by Grid Engine
  2. Do not use the $SGE_ROOT/$SGE_CELL/common/sgemaster and sgeexecd start/stop scripts any more
  3. Move all Grid Engine start/stop actions into the new 10.5.x launchd framework as described here http://blog.bioteam.net/2008/03/04/apple-os-x-105-launchd-scripts-for-grid-engine/
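As a rough illustration of step 3, the sketch below uses Python's standard plistlib module to generate a launchd job description for the qmaster daemon. It is not the configuration from the linked blog post. The job label, the /opt/sge install path and the darwin-x86 binary directory are hypothetical placeholders, and the use of the SGE_ND environment variable to keep the daemon in the foreground (as launchd expects of the jobs it supervises) is an assumption that should be checked against the installed Grid Engine version.

```python
import plistlib


def build_sge_job(label, daemon_path, sge_root, sge_cell="default"):
    """Return a launchd job description (as a dict) for one SGE daemon."""
    return {
        "Label": label,
        "ProgramArguments": [daemon_path],
        "EnvironmentVariables": {
            "SGE_ROOT": sge_root,
            "SGE_CELL": sge_cell,
            # Assumption: SGE_ND tells the daemon not to daemonize itself.
            "SGE_ND": "true",
        },
        # Start the job as soon as launchd loads it (i.e. at boot).
        "RunAtLoad": True,
    }


# Hypothetical label and paths; adjust to the local installation.
job = build_sge_job(
    label="org.example.sge.qmaster",
    daemon_path="/opt/sge/bin/darwin-x86/sge_qmaster",
    sge_root="/opt/sge",
)

# Serialize to the XML property list format that launchd reads.
plist_xml = plistlib.dumps(job).decode("utf-8")
print(plist_xml)
```

The resulting XML would typically be saved as a .plist file under /Library/LaunchDaemons and loaded with "sudo launchctl load". A second job of the same shape would be needed for sge_execd on execution hosts.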

Running SGE via hand-made launchd files is not an optimal solution, but it has been found to resolve the error messages described above. Our best hypothesis at this time is that a change introduced in release 10.5.4 has rendered the SystemStarter framework and the regular SGE start/stop scripts either unusable or unreliable.

More information and background on this issue can be found via these links

External links