How to Submit a Job

== Job policy ==

=== ClassAds ===

The Statistics Cluster is equipped with a powerful job queuing system called [http://research.cs.wisc.edu/htcondor/ Condor]. This framework makes efficient use of resources by matching user needs to the available resources, taking into account both the priorities for the hardware and the preferences of the job. Matching resource requests to resource offers is accomplished through the <b><i>ClassAds</i></b> mechanism. Each virtual machine publishes its parameters as a kind of <u>class</u>ified <u>ad</u>vertisement to attract jobs. A job submitted to Condor for scheduling may list its requirements and preferences.
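
Each machine's ClassAd can be inspected directly. For example (a sketch assuming Condor's standard command-line tools are on your path), the following lists the slots that advertise the statistics group tag used later on this page:

<pre>
# list slots belonging to the statistics group
condor_status -constraint 'ParallelSchedulingGroup == "stats group"'

# dump the full ClassAd of one slot (<machine> is a placeholder host name)
condor_status -long <machine>
</pre>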

=== AccountingGroup ===

There are four types of jobs that can be submitted to the cluster:

{| class="wikitable" style="text-align: center; background: #A9A9A9"
| style="background: #f9f9f9; width: 200px" | <b>Job Type</b>
| style="background: #f9f9f9; width: 200px" | <b>Resource Quota</b>
| style="background: #f9f9f9; width: 200px" | <b>Maximum Runtime</b>
| style="background: #f9f9f9; width: 400px" | <b>Line Required in Submit File</b>
|-
| style="background: #f9f9f9; width: 200px" | standardjob
| style="background: #f9f9f9; width: 200px" | 450
| style="background: #f9f9f9; width: 200px" | 24 hours
| style="background: #f9f9f9; width: 400px" | default, no additional line in submit file
|-
| style="background: #f9f9f9; width: 200px" | longjob
| style="background: #f9f9f9; width: 200px" | 250
| style="background: #f9f9f9; width: 200px" | 48 hours
| style="background: #f9f9f9; width: 400px" | +AccountingGroup = "group_statistics_longjob.username"
|-
| style="background: #f9f9f9; width: 200px" | shortjob
| style="background: #f9f9f9; width: 200px" | 250
| style="background: #f9f9f9; width: 200px" | 8 hours
| style="background: #f9f9f9; width: 400px" | +AccountingGroup = "group_statistics_shortjob.username"
|-
| style="background: #f9f9f9; width: 200px" | testjob
| style="background: #f9f9f9; width: 200px" | 50
| style="background: #f9f9f9; width: 200px" | 20 minutes
| style="background: #f9f9f9; width: 400px" | +AccountingGroup = "group_statistics_testjob.username"
|}

*When jobs are submitted to the cluster, Condor assigns resources to satisfy each group's quota. If 1000 jobs from each group are submitted, each group should meet its resource quota, and the remaining jobs will wait for the next available resource.
*If a group submits more jobs than its quota allows, the surplus jobs are pooled with all other surplus jobs. These jobs receive resources based on user priority, as explained in [http://gryphn.phys.uconn.edu/statswiki/index.php/How_to_Submit_a_Job#User_Priority the next section].
*To prevent users from holding onto resources, maximum runtimes are enforced. When a job runs beyond its maximum runtime, a job in the queue may preempt it.
*Jobs may also be preempted if one group is over quota and new jobs from a different group are submitted. The new group can preempt the surplus jobs up to its own quota, without reducing the running group's quota.

<b>Users are expected to adjust their jobs to meet these runtime requirements.</b>
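
To confirm which accounting group your queued jobs ended up in, one quick check (a sketch using the standard Condor tools; <i>username</i> is a placeholder) is:

<pre>
# print the cluster id and accounting group of each queued job;
# jobs in the default standardjob group show "undefined"
condor_q -submitter username -af ClusterId AccountingGroup
</pre>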
  
 
=== User Priority ===

When jobs are submitted, Condor must allocate the available resources among the requesting users. In addition to adhering to the Accounting Groups, it does so using a value called <i>userprio</i> (user priority). The lower the value of <i>userprio</i>, the higher the priority of that user. For example, a user with <i>userprio</i> 5 has a higher priority than a user with <i>userprio</i> 50. The share of available machines that a user should be allocated is continuously recalculated by Condor and changes based on the individual's resource use. If a user holds more machines than their <i>userprio</i> entitles them to, the value worsens by increasing over time; if they hold fewer, it improves by decreasing over time. This is how Condor distributes machine resources fairly among users.

On the stats cluster, each student and faculty member is given a <i>priority factor</i> of 1000, which is used to calculate the user's effective priority. Any non-UConn user of the cluster has a priority factor of 2000, so that priority is given to UConn users. As users claim machines, their effective priority adjusts accordingly.
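
You can inspect your current effective priority and priority factor with <i>condor_userprio</i> (a quick check, assuming the standard Condor command-line tools are on your path):

<pre>
# show all priority fields, including the priority factor, for each active user
condor_userprio -all
</pre>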
== Job Submission ==
  
 
=== Submit File ===

Jobs are submitted with the <i>condor_submit</i> command, with a job description file passed as an argument:

<pre>
condor_submit myprog.condor
</pre>

A simple, standard group description file goes as follows:
  
<pre>Requirements = ParallelSchedulingGroup == "stats group"
Universe   = vanilla
Executable = myprog
Arguments  = $(Process)
request_cpus = 1

output    = myprog-$(Process).out
error     = myprog-$(Process).err
Log       = myprog.log

transfer_input_files = myprog
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
on_exit_remove = (ExitCode =?= 0)
transfer_output_remaps = "<default_output_filename> = /home/<username>/jobs/<updated_output_path_and_filename>"

Queue 50</pre>
  
Make sure that the <b>last line</b> in your submit file is "Queue <number>".
  
Most of the variables are self-explanatory. The <b>executable</b> is a path to the program binary or executable script. The use of the <b>requirements</b> variable shown here is important: it constrains job assignment to Statistics Cluster nodes only. All available nodes are tagged with the <i>ParallelSchedulingGroup</i> variable in their ClassAds, so this is an effective way to direct execution to particular cluster segments. The <b>output</b>, <b>error</b> and <b>log</b> variables create the respective records for each job, numbered by Condor with the <i>$(Process)</i> variable. A detailed example of a job is available [http://gryphn.phys.uconn.edu/statswiki/index.php/Example_Jobs here].

If your job requires input from another file, the following can be added above the output line:
 
<pre>
input = input.file
</pre>

where <i>input.file</i> is the name of your file. It is implied that the file is in the same directory as the submit file.
  
The <b>universe</b> option in the submission file specifies the Condor runtime environment. Vanilla is the simplest runtime environment and executes a single-core program inside a single job slot. Multi-core and multi-processor jobs can be scheduled using the parallel universe. For jobs requiring multiple cores, set <b>request_cpus</b> to the desired number; note that the more cores you request, the longer you may wait for a machine with sufficient free resources to become available. See the Condor documentation for more details on scheduling jobs in the parallel universe.
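
For example, the relevant lines of a four-core vanilla-universe job would look like this sketch (the rest of the submit file is unchanged from the template above):

<pre>
Universe     = vanilla
Executable   = myprog
request_cpus = 4

Queue 1
</pre>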
  
 
For optimal allocation of resources, <b><i>serial jobs ought to be submitted to Condor as well</i></b>. This is accomplished by omitting the number of job instances, leaving only the directive <i>Queue</i> in the last line of the job description file outlined above. The <i>$(Process)</i> placeholder is then no longer necessary, since there will be no enumeration of output files.
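
A minimal serial description file following the template above would therefore end with a bare <i>Queue</i>, for example:

<pre>Requirements = ParallelSchedulingGroup == "stats group"
Universe   = vanilla
Executable = myprog

output    = myprog.out
error     = myprog.err
Log       = myprog.log

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

Queue</pre>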
  
=== AccountingGroup Example ===

A simple testjob group description file goes as follows:

<pre>Requirements = ParallelSchedulingGroup == "stats group"
+AccountingGroup = "group_statistics_testjob.username"
Universe   = vanilla
Executable = myprog
Arguments  = $(Process)
request_cpus = 1

output    = myprog-$(Process).out
error     = myprog-$(Process).err
Log       = myprog.log

transfer_input_files = myprog
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
on_exit_remove = (ExitCode =?= 0)
transfer_output_remaps = "<default_output_filename> = /home/<username>/jobs/<updated_output_path_and_filename>"

Queue 50</pre>

Remember to replace "username" with your stats cluster username. This sample submit file can be used for the shortjob and longjob groups by replacing "testjob" with "shortjob" or "longjob".

=== GCC 4.9.2 ===

CentOS 6 ships with GCC 4.4.7 as its default compiler. Version 4.9.2 is available, but it must be enabled by the user. The recommended way to do this is inside your job's executable: for example, make /bin/bash the executable and transfer an executable bash script as an input file. Within this script, enable GCC 4.9.2 and then run your code.

The submit file:

<pre>
...
Executable = /bin/bash
Arguments  = myBashScript
# if your script takes arguments, use this form instead:
# Arguments = myBashScript arg1 arg2 ...
transfer_input_files = myBashScript
...
</pre>

myBashScript (make sure this is executable: chmod +x myBashScript)

<pre>
#!/bin/bash
# At the very beginning of your script, add this line to switch to GCC 4.9.2:
source scl_source enable devtoolset-3

# To convince yourself that you are now using gcc 4.9.2, the following line
# prints the gcc version into your output file:
gcc --version

# Now include the commands needed to run your code.
exe="root"
opt1="-l"
opt2="-b"
macro="runDSelector.C(\"$1\")"

command=( "$exe" "$opt1" "$opt2" "$macro" )

"${command[@]}"
</pre>

This is just an example bash script: it opens the software ROOT and executes a macro called runDSelector.C with a single argument. It is not the only way to structure such a script.
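
If each job in a batch should pass a different argument to the script, the <i>$(Process)</i> variable can be forwarded through <b>Arguments</b>. A hypothetical variation (input_0.dat, input_1.dat, ... are placeholder file names):

<pre>
Executable = /bin/bash
Arguments  = myBashScript input_$(Process).dat

# the data files must also be transferred along with the script
transfer_input_files = myBashScript, input_$(Process).dat

Queue 10
</pre>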

== Some guidelines ==

=== Memory ===

If you find that your job is being held, it is possible that the job is exceeding its memory (resident set) quota. You can check this by examining your log file and comparing how much disk and memory were used with what was requested. If the disk usage far exceeds the requested amount, you are likely thrashing the cluster by using swap space (hard disk) instead of memory.
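
To see why a job was held (a sketch using the standard Condor tools; <cluster_id> is a placeholder):

<pre>
# list held jobs together with the reason each was put on hold
condor_q -hold

# after raising the memory request, release the held jobs
condor_release <cluster_id>
</pre>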

To request more memory for your job, add the following line to your submit file:

<pre>
request_memory = <size in MB>
</pre>

If a job was held because of a memory quota issue, test it before submitting a large batch: add the amount of disk used to your initial memory request and do a test run. Once you are satisfied that the job stays within its requests, submit the full set of jobs.
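
For instance, with hypothetical numbers: if the log shows roughly 1.5 GB used against a smaller request, round the request up and test a single job before the full batch:

<pre>
request_memory = 2048

Queue 1
</pre>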
  
=== Large job queues ===

Never submit more than 5,000 jobs at once. The cluster can only negotiate so many queued jobs, and overloading the queue prevents the resource manager from properly negotiating resources.

Remember that other users may also be submitting large sets of jobs, so keep your batches small.