Help:Tool Labs/Grid

From Wikitech

Every non-trivial task performed in Tool Labs should be dispatched by the grid engine, which ensures that the job is run in a suitable place with sufficient resources. The basic principle of running jobs is fairly straightforward:

  • You submit a job to a work queue from a submission server (e.g., -login) or web server.
  • The grid engine master finds a suitable execution host to run the job on, and starts it there once resources are available.
  • As it runs, your job sends output and errors to files until it completes or is aborted.

Jobs can be scheduled synchronously or asynchronously, continuously, or simply executed once. If a continuous job fails, the grid will automatically restart the job so that it keeps going.

What is the grid engine?

The grid engine is a highly flexible system for assigning resources to jobs, including parallel processing. The Tool Labs grid engine is implemented with Open Grid Engine (the open-source fork of Sun Grid Engine). You can find more documentation on the Open Grid Engine website.

Commonly used Grid Engine commands include:

  • qsub: submit jobs to the grid
  • qalter: modify job settings (while the job is waiting or running)
  • qstat: get information about a queued or running job
  • qacct: extracts arbitrary accounting information from the cluster logfile (also after job termination, useful for debugging)
  • qdel: abort or cancel a job

You can find detailed information about these commands in the Grid Engine Manual.

The Open Grid Engine commands are very flexible, but a little complex at first – you might prefer to use the helper scripts instead (jsub, jstart, jstop) described in more detail in the next sections.

Submitting simple one-off jobs using 'jsub'

Jobs with a finite duration can be submitted to the work queue with either Open Grid’s 'qsub' command or the 'jsub' helper script, which is simpler to use and described in this section. (For information about qsub, please see the Open Grid Engine Manual.)

To run a finite job on demand (at intervals from cron, for instance, or from a web tool or the command line), simply use the 'jsub' command:

$ jsub [options…] program [args…]

By default, jsub will schedule the job to be run as soon as possible, and print the eventual output to files (‘jobname.out’ and ‘jobname.err’) in your home directory. Unless a job name is explicitly specified with jsub options, the job will have the same name as the program, minus extensions (e.g., if you have a program named foobot.pl and start it with jsub, the job's name will be foobot.)

Once your job has been submitted to the grid, you will receive output similar to the line below, which includes the job id and job name.

Your job 120 ("foobot") has been submitted
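When jsub is called from a script, the job id can be recovered from the confirmation message. The following is a minimal sketch. It assumes that the message keeps exactly the shape shown above and that jsub writes it to standard output. The function name parse_job_id is made up for illustration and is not one of the Tool Labs helper scripts.

```shell
#!/bin/bash
# parse_job_id (hypothetical helper): print the numeric id found in a
# confirmation line of the form: Your job 120 ("foobot") has been submitted
parse_job_id() {
    printf '%s\n' "$1" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Demonstration on the sample message from this page.
parse_job_id 'Your job 120 ("foobot") has been submitted'

# On the grid, the real message could be captured instead (assumption: the
# message goes to standard output; not executed here):
#   message=$(jsub foobot.pl)
#   job_id=$(parse_job_id "$message")
```

If the line does not match the expected shape, the function prints nothing, so callers can test for an empty result.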

jsub options

In addition to a number of customized options, jsub supports many, but not all, qsub options:

Naming jobs

The job name identifies the job and can also be used to control it (e.g., to suspend or stop it). By default, jobs are assigned the name of the program or script, minus its extension. For instance, if you started a program named 'foobot.pl' with jsub, the job's name would be 'foobot'.

It's important to note that you can have more than one job, running or queued, bearing the same name. Some of the utilities that accept a job name may not behave as expected in those cases.

Specify a different name for the job using jsub’s -N option:

jsub -N NewName program [args…]

Allocating additional memory

By default, jobs are allowed 256MB of memory; you can request more (or less) with jsub’s -mem option (or qsub's -l h_vmem=memory). Keep in mind that a job that requests more resources may be penalized in its priority and may have to wait longer before being run until sufficient resources are available.

$ jsub -mem 500m program [args…]

For example, loading a PHP script via jsub requires at least 350MB of memory to work properly:

jsub -mem 350m php /data/project/yourproject/public_html/test.php

Specifying Ubuntu release

There are two different versions of Ubuntu in use on Tools: 12.04 ('precise') and 14.04 ('trusty'). By default, jobs are run on exec hosts that run precise, but this will change to trusty in the future.

If you require a software package that is available in trusty, but not in precise, you can ask for your job to run on a trusty host with

$ jsub -l release=trusty program [args...]

If, for some reason, you require jobs to stay running on precise, even in the future (e.g. because you need an older version of some library), you can specify this with

$ jsub -l release=precise program [args...]

Synchronizing jobs

By default, jobs are processed asynchronously in the background. If you need to wait until the job has completed (for instance, to do further processing on its output), you can add the -sync y (for sync y[es]!) option to the jsub command:

$ jsub -sync y program [args...]
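A synchronous submission is typically followed by a step that reads the job's output. The sketch below combines the -sync y and -N options described on this page and then prints the job's output file. The helper name run_and_collect and the example program are hypothetical. The output location (~/jobname.out) is the default described earlier, and the file may also hold output from earlier runs of a job with the same name.

```shell
#!/bin/bash
# run_and_collect (hypothetical helper): submit a program under job name NAME,
# wait until it has finished, then print its captured standard output.
# Assumes the default output location described on this page: ~/NAME.out
run_and_collect() {
    local name=$1
    shift
    # -sync y makes jsub wait for the job to complete before returning.
    jsub -sync y -N "$name" "$@" || return 1
    cat "$HOME/$name.out"
}

# Example call on the grid (not executed here; wordcount.sh is a placeholder):
#   run_and_collect wordcount ./wordcount.sh input.txt
```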

Running a job only once

If you need to make certain that the job isn't running multiple times (such as when you invoke it from a crontab), you can add the -once option. If the job is already running or queued, the grid engine will simply mark the failed attempt in the error file and return immediately.

$ jsub -once program [args...]

Quoted arguments

Jsub (actually qsub) always strips the quotes in the arguments of a job. If the arguments include any special bash characters like spaces, "|" or "&" then the job submission will likely fail, even when the arguments are given quoted to jsub (see bugzilla:48811).

The best way to avoid this issue is to use a wrapper script.

A simple workaround is to use two layers of quotes:

$ jsub program -arg1 "'^(foo|bar)$'"
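The wrapper-script approach mentioned above can be sketched as follows. The awkward argument lives inside the wrapper, so it never has to pass through the submission command line. The file names and the stand-in program are placeholders for illustration. The example runs the wrapper locally to show that the argument arrives intact, and the jsub call itself is only shown as a comment.

```shell
#!/bin/bash
# Sketch: keep the special characters inside a wrapper script.
workdir=$(mktemp -d)

# Stand-in for the real program: it only prints the arguments it receives.
cat > "$workdir/program" <<'EOF'
#!/bin/bash
printf '%s\n' "$@"
EOF

# The wrapper holds the quoted regular expression and calls the program
# that sits next to it.
cat > "$workdir/wrapper.sh" <<'EOF'
#!/bin/bash
exec "$(dirname "$0")/program" -arg1 '^(foo|bar)$'
EOF

chmod +x "$workdir/program" "$workdir/wrapper.sh"

# Run locally to confirm that the argument is passed through unchanged.
"$workdir/wrapper.sh"

# On the grid, the wrapper would be submitted without any arguments:
#   jsub "$workdir/wrapper.sh"
```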

Submitting continuous jobs (such as bots) with 'jstart'

Continuous jobs, such as bots, have a dedicated queue ('continuous') which is set up slightly differently from the standard queue:

  • Jobs started on the continuous queue are automatically restarted if they, or the node they run on, crash
  • In case of outage or lack of resources, continuous jobs will be stopped and restarted automatically on a working node
  • Only tool accounts can start continuous jobs
  • Continuous jobs are not restarted if they end normally (with the exit status 0)

For convenience, the jstart script (which accepts all the jsub options) facilitates the submission of continuous jobs:

$ jstart [options…] program [args…]

The jstart script will start the program in continuous mode (if it is not already running), and ensure that the program keeps running.

Note that the jstart script is equivalent to:

$ jsub -once -continuous program [args…]

jsub's '-once' option is important for ensuring that the job can be managed reliably with the job and jstop utilities. The '-continuous' option ensures that the job will be restarted automatically until it exits normally with an exit value of zero, indicating completion.

Bigbrother

The bigbrother daemon will watch tool jobs you specify and restart them if they fail for any reason.

If one of the jobs it tracks is not running (or pending), it will attempt to start it again. Bigbrother will attempt to start a job up to ten times in a 24-hour window and throttles any further restarts.

If it restarts a job, or fails to do so, it will send an email to the tool's maintainers and log the event to `~/bigbrother.log`.

For every job you want to watch, you have to add a line to `~/.bigbrotherrc` (that file is checked for jobs to watch every couple of minutes).

Bigbrother understands entries of the form:

 jstart -N <jobname> [more options]

Such an entry makes bigbrother watch for a continuous job with the specified name and use the specified command line to restart it if it stops. Please note that the -N option is mandatory and must be the first option specified.

Any other entry is ignored and causes an error, which is also mailed to the tool's maintainers.

Any output from jstart will be appended to the `bigbrother.log` file.
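Because a malformed entry results in an error mail, it can be handy to check a line before adding it to `~/.bigbrotherrc`. The sketch below encodes only the rule stated on this page (the line starts with jstart, and -N with a job name is the first option). It is not bigbrother's actual parser, and the function name check_bigbrotherrc_line is made up for illustration.

```shell
#!/bin/bash
# check_bigbrotherrc_line (hypothetical helper): succeed only if the line
# starts with "jstart -N <jobname>", i.e. -N is present and is the first option.
check_bigbrotherrc_line() {
    printf '%s\n' "$1" |
        grep -Eq '^jstart[[:space:]]+-N[[:space:]]+[^[:space:]-][^[:space:]]*([[:space:]]|$)'
}

# -N comes first: accepted.
check_bigbrotherrc_line 'jstart -N xbot -mem 500m ./xbot.pl' && echo "accepted"

# -N is present but not the first option: rejected.
check_bigbrotherrc_line 'jstart -mem 500m -N xbot ./xbot.pl' || echo "rejected: -N must come first"
```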

Note that you do not need to use this for webservices any more. Those are handled by service manifest monitors.

Managing Jobs

Each job submitted to the grid has a unique job id as well as a job name (which will not be unique if you have more than one instance running). The name and id identify the job, and can also be used to retrieve information about its status.

If you don’t know the job id, you can find it with either the ‘job’ command or the ‘qstat’ command. Both of these commands can also be used to return additional status information, as described in the next sections.

Finding a job id and status with the ‘job’ command

If you know that your job has only one instance running (if you used the -once option when starting it, for example) you can use the ‘job’ command to get its job id:

tools.xbot@tools-login:~$ job xbot
717898

Use the job command’s -v (‘verbose’) option to return additional status information:

tools.xbot@tools-login:~$ job -v xbot
Job 'xbot' has been running since 2013-04-01T21:00:00 as id 717898

The verbose response is particularly useful from scripts or web services.
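As an illustration of using the verbose response from a script, the sketch below extracts the job id and start time from a line shaped like the sample above. The wording of the response for jobs in other states is not documented on this page, so any other line simply produces no output. The function name parse_job_status is hypothetical.

```shell
#!/bin/bash
# parse_job_status (hypothetical helper): read "job -v" output on stdin and
# print "<id> <start time>" when the line reports a running job.
parse_job_status() {
    sed -n "s/^Job '.*' has been running since \([^ ]*\) as id \([0-9][0-9]*\)\$/\2 \1/p"
}

# Demonstration on the sample line from this page.
echo "Job 'xbot' has been running since 2013-04-01T21:00:00 as id 717898" | parse_job_status

# On the grid (not executed here):
#   job -v xbot | parse_job_status
```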

Once you know the job id, you can use the ‘qstat’ command to return additional information about it. See Returning the status of a particular job for more information.

Using ‘qstat’ to return status information

The ‘qstat’ command returns detailed information about the status of queued jobs. If you know the job id of a particular job, you can use qstat’s ‘-j’ option to return information about that job. If you use the ‘qstat’ command without options, it will return the status of all your currently running and pending jobs. More information about running qstat without options and with the -j option is included in the following sections. For more information about qstat in general, please see the Open Grid Manual.

Returning the status of all your queued jobs

To see the status of all of your running and pending jobs (including the job number), use the ‘qstat’ command without options. ‘qstat’ will then return the job id, priority, name, owner, state (e.g., r(unning) or s(uspended)), the date and time the job was submitted or started, and the name of the assigned job queue (e.g., continuous) for each job.

For example:

tools.xbot@tools-login:~$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------    
120    0.50000   xbot   tools.xbot         r     04/01/2013 21:00:00 continuous@tools-exec-01.pmtpa     1        

Common job states include:

  • r (running)
  • qw (queued/waiting)
  • d (deleted)
  • E (error)
  • s (suspended)

See the Open Grid Manual for a complete list of states and abbreviations.
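The column layout shown above lends itself to simple scripting. The following sketch reads plain qstat output, skips the two header lines and prints the id, name and state of each job, expanding the state codes listed above. Both function names are hypothetical, and state codes not in that list fall through to a generic label.

```shell
#!/bin/bash
# summarize_qstat (hypothetical helper): read plain "qstat" output on stdin,
# skip the two header lines, and print "<job id> <name> <state>" per job.
summarize_qstat() {
    awk 'NR > 2 && NF >= 5 { print $1, $3, $5 }'
}

# describe_state (hypothetical helper): expand the state codes listed above.
describe_state() {
    case $1 in
        r)  echo "running" ;;
        qw) echo "queued/waiting" ;;
        d)  echo "deleted" ;;
        E)  echo "error" ;;
        s)  echo "suspended" ;;
        *)  echo "other ($1)" ;;
    esac
}

# Demonstration on the sample output from this page.
sample='job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
120    0.50000   xbot   tools.xbot         r     04/01/2013 21:00:00 continuous@tools-exec-01.pmtpa     1'

printf '%s\n' "$sample" | summarize_qstat | while read -r id name state; do
    echo "$id $name: $(describe_state "$state")"
done

# On the grid (not executed here):
#   qstat | summarize_qstat
```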

Returning the status of a particular job

If you know the job id of a job, you can find out more information about it using the ‘qstat’ command's ‘-j’ option. For example, the following command returns detailed information about job id 990.

tools.toolname@tools-login:~$ qstat -j 990
==============================================================
job_number:                 990
exec_file:                 job_scripts/990
submission_time:            Wed Apr 13 08:32:39 2013
owner:                      tools.toolname
uid:                        40005
group:                      tools.toolname
gid:                        40005
sge_o_home:                 /data/project/toolname/
sge_o_log_name:             tools.toolname
sge_o_path:                 /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/X11R6/bin
sge_o_shell:                /bin/bash
sge_o_workdir:              /data/project/toolname
sge_o_host:                 tools-login
account:                    sge
stderr_path_list:           NONE:NONE:/data/project/toolname//taskname.err
hard resource_list:         h_vmem=256m
mail_list:                  tools.toolname@tools-login.pmtpa.wmflabs
notify:                     FALSE
job_name:                   epm
stdout_path_list:           NONE:NONE:/data/project/toolname//taskname.out
jobshare:                   0
hard_queue_list:            task
env_list:
script_file:                /data/project/toolname/taskname.py
usage    1:                 cpu=00:21:08, mem=158.09600 GBs, io=0.00373, vmem=127.719M, maxvmem=127.723M
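Individual fields can be pulled out of this listing with a small filter. The sketch below matches a field label at the start of a line and prints whatever follows the colon. The function name qstat_field is made up for illustration, and the sample is a shortened copy of the output above.

```shell
#!/bin/bash
# qstat_field (hypothetical helper): read "qstat -j" output on stdin and
# print the value that follows the given field label.
qstat_field() {
    # The label is matched literally at the start of the line;
    # the value is everything after the colon, minus leading blanks.
    awk -v label="$1:" 'index($0, label) == 1 {
        value = substr($0, length(label) + 1)
        sub(/^[ \t]+/, "", value)
        print value
    }'
}

# Shortened copy of the sample output from this page.
sample='job_number:                 990
hard resource_list:         h_vmem=256m
job_name:                   epm'

printf '%s\n' "$sample" | qstat_field job_name
printf '%s\n' "$sample" | qstat_field "hard resource_list"

# On the grid (not executed here):
#   qstat -j 990 | qstat_field job_name
```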

Common shell exit codes[1], as reported for example by qacct, are listed below. There are no standard exit codes apart from 0 meaning success, and a non-zero code does not necessarily mean failure.

exit_status | Meaning | Example | Comments
0 | Success | | No errors, meaning success
1 | Catchall for general errors | let "var1 = 1/0" | Miscellaneous errors, such as "divide by zero" and other impermissible operations
2 | Misuse of shell builtins (according to Bash documentation) | empty_function() {} | Missing keyword or command
126 | Command invoked cannot execute | /dev/null | Permission problem or command is not an executable
127 | "command not found" | illegal_command | Possible problem with $PATH or a typo
128 | Invalid argument to exit | exit 3.14159 | exit takes only integer args in the range 0 - 255
128+n | Fatal error signal "n" | kill -9 $PPID of script | $? returns 137 (=128+9)
128+2=130 | Script terminated by Control-C | Ctrl-C | Control-C generates SIGINT which is fatal error signal 2
128+9=137 | Process terminated by kernel (no further signal handling performed) | kill -9 $PPID of script | Kernel immediately terminates any process sent this signal, generating SIGKILL which is fatal error signal 9
128+11=139 | Segmentation fault (kernel killed process due to segfault) | | E.g. the program accessed a not assigned memory location, generating SIGSEGV which is fatal error signal 11
255 | Exit status out of range | exit -1 | exit takes only integer args in the range 0 - 255

See the signal man pages (for example signal(7), or the documentation of the signal.h header) for a more comprehensive list of the values ("n") of the possible fatal error signals (SIG...) issued by the kernel.
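The 128+n convention from the table above can be applied mechanically. The sketch below is a rough interpreter for an exit status, with describe_exit as a hypothetical helper name. Keep in mind that this is only a shell convention: a program can also return such a value on its own, so the result is a hint rather than a diagnosis.

```shell
#!/bin/bash
# describe_exit (hypothetical helper): interpret an exit status using the
# shell convention from the table above (128+n means "ended by signal n").
describe_exit() {
    local status=$1
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ] && [ "$status" -lt 255 ]; then
        echo "terminated by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Demonstration: a child shell that kills itself with SIGKILL (signal 9).
status=0
bash -c 'kill -9 $$' || status=$?
describe_exit "$status"
```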

Stopping jobs with ‘qdel’ and ‘jstop’

If you started a job with the 'jstart' command, or if you know there is only one job with the same name, then you can also use the 'jstop' utility command with the job name to stop it:

jstop job_name

You can also use the underlying ‘qdel’ command with a job’s number or name:

qdel job_number/job_name

This will also delete matching jobs that have only been queued, but not started yet. Do note that if you specify a 'job_name', all queued or running jobs with that name are deleted.

If you do not know the job number, you can find it using the ‘qstat’ command.

Restarting jobs

To stop and restart a running job in a single command (e.g., after you have made a bugfix), use:

qmod -rj job_number

Suspending and unsuspending jobs with ‘qmod’

Suspending a job allows it to be temporarily paused, and then resumed later. To suspend a job use:

qmod -sj job_id

The job will be paused (SIGSTOP). Note that the qstat command will return a state of ‘s’ for suspended jobs. If you do not know the job number, you can find it using the ‘qstat’ command.

To unsuspend the job and let it continue running use:

qmod -usj job_id

Unsuspended jobs should return to the 'r' state in qstat.

Scheduling jobs at regular intervals with cron

To schedule jobs to run on specific days or at specific times of day, you can use cron to submit the jobs to the grid.

Scheduling a command more often than every five minutes (for example * * * * * command) is highly discouraged, even if the command is "only" jsub. In these cases, you very probably want to use 'jstart' instead. The grid engine ensures that jobs submitted with 'jstart' are automatically restarted unless they end normally (with exit status 0).


Creating a crontab

Crontabs are set (as on any Unix system) using "crontab -e" or "crontab FILE".

Note that the PATH is set differently for interactive shells and cron jobs.

Please be aware that any submitted crontab is automatically edited so that its jobs are sent to the grid directly: a default jsub invocation is prepended to each cron entry unless the entry already has one.

If your cron entry only runs a brief script that itself sends any real work to the grid, you can skip that automatic invocation by prepending jlocal, which explicitly marks the entry as a local job. Any script or job invoked with jlocal should run for no more than a few seconds and use minimal resources; misuse of this feature may severely affect general reliability for all users and is not allowed.

Specifying time zones

The ‘tools’ project, like other hosting environments, uses the UTC time zone (run date to see the current UTC time). If you need to schedule a job for another time zone, you can do so in the crontab. For example, to schedule a job for midnight in Germany, you can use the crontab line:

0 22,23 * * * [ "$(TZ=:Europe/Berlin date +\%H)" = "00" ] && jsub ...

The above crontab line instructs the system to check at 22:00 UTC (23:00 CET and 0:00 CEST) and at 23:00 UTC (0:00 CET and 1:00 CEST) whether it is midnight in Berlin and, if so, to call jsub. Note that you can't just replace "Berlin" with "Hamburg"; the values for TZ are limited to those found at /usr/share/zoneinfo. If you are unsure of your time zone's offset from UTC, you can run the check hourly by replacing 22,23 with *.
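The hour test used in that crontab line can be tried out from a normal shell. The sketch below assumes GNU date (for the -d option) and installed zoneinfo files; the helper name berlin_hour is made up for illustration. In a shell script the format is written as +%H; the backslash in +\%H is only needed inside a crontab, where an unescaped percent sign has a special meaning.

```shell
#!/bin/bash
# berlin_hour (hypothetical helper): print the hour of day in Europe/Berlin
# for a given instant. Relies on GNU date (-d) and the system zoneinfo files.
berlin_hour() {
    TZ=:Europe/Berlin date -d "$1" +%H
}

# Summer time (CEST, UTC+2): 22:00 UTC is midnight in Berlin.
berlin_hour '2013-07-01 22:00 UTC'

# Winter time (CET, UTC+1): 23:00 UTC is midnight in Berlin.
berlin_hour '2013-01-01 23:00 UTC'
```

With the zone data in place, both calls should print 00, which is the value the crontab line compares against.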


FAQ

My shell script job fails with "Exec format error"

The program you want to execute must either be a binary executable or a script. In the latter case, it must begin with a shebang line that names the interpreter (/usr/bin/perl, /usr/bin/python, etc.). For shell scripts, this means that in most cases the first line needs to be #!/bin/bash.
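A quick way to spot the problem before submitting is to look at the first two bytes of the script. The following sketch creates two throwaway scripts in a temporary directory and reports which of them has an interpreter line. The helper name has_shebang and the file names are made up for illustration.

```shell
#!/bin/bash
# has_shebang (hypothetical helper): succeed if the file starts with "#!".
has_shebang() {
    [ "$(head -c 2 "$1")" = '#!' ]
}

# Two throwaway scripts: one with an interpreter line, one without.
workdir=$(mktemp -d)
printf '#!/bin/bash\necho hello\n' > "$workdir/good.sh"
printf 'echo hello\n' > "$workdir/bad.sh"

for script in "$workdir/good.sh" "$workdir/bad.sh"; do
    if has_shebang "$script"; then
        echo "$(basename "$script"): has an interpreter line"
    else
        echo "$(basename "$script"): no interpreter line, may fail with Exec format error"
    fi
done
```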

Notes

  1. http://stackoverflow.com/questions/1101957/are-there-any-standard-exit-status-codes-in-linux