Difference between revisions of "How to Monitor Jobs"
Revision as of 15:43, 3 November 2016
Check the state of the job
Check the jobs of a specific user
Once jobs have been submitted to the cluster, you can monitor them by running the following command in a terminal:
[username@computer ~]$ condor_q -submitter <username> | less
This will display:
- ID: the process ID
- OWNER: the owner of the job
- SUBMITTED: the date and time it was submitted
- RUN_TIME: how long it has been running
- ST: its current status (R running, H held, I idle)
- SIZE: the job size
- CMD: program name
This is useful for checking the status of your own jobs.
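For a queue with many jobs, a per-status count can be easier to read than the full listing. The following is a minimal sketch, not part of the original instructions: the helper name count_by_status is hypothetical, and it assumes the column layout listed above, where SUBMITTED takes two whitespace-separated fields (date and time) so that ST is the sixth field. If your condor_q prints a different layout, adjust the field number.

```shell
# Hypothetical helper: count jobs per status code (R, H, I, ...) from
# condor_q output read on stdin.
# Assumption: job lines start with an ID of the form <cluster>.<process>
# and ST is the sixth whitespace-separated field.
count_by_status() {
    awk '$1 ~ /^[0-9]+\.[0-9]+$/ { count[$6]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Usage (replace <username> with your own user name):
#   condor_q -submitter <username> | count_by_status
```

Header and summary lines are skipped because their first field does not look like a job ID.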
Check all jobs
If you want to see all of the jobs in the queue:
condor_q | less
Check which machine the job is running on
Another useful command is condor_status, which reports information about the cluster machines:
[username@computer ~]$ condor_status [-r] | grep stat | less
This shows a list of the various machine resources. If the option -r is supplied, it shows only machines with running jobs.
If there are any concerns about a specific job, please contact the main administrator.
Check the job as it runs
Check stdout
If a job is running, you can execute the following command to see the tail of its stdout file:
condor_tail <job_id>
Check stderr
If you want to check whether there are any errors, you can run the following:
condor_tail -no-stdout -stderr <job_id>
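The two commands above can be combined into one small helper so that both streams of a running job are checked in a single call. This is only a sketch: the function name show_job_tails is hypothetical, and it uses nothing beyond the two condor_tail invocations already shown.

```shell
# Hypothetical helper: print the tail of stdout, then the tail of stderr,
# for one running job. Uses only the condor_tail calls shown above.
show_job_tails() {
    job_id="$1"
    if [ -z "$job_id" ]; then
        echo "usage: show_job_tails <job_id>" >&2
        return 1
    fi
    echo "== stdout of job $job_id =="
    condor_tail "$job_id"
    echo "== stderr of job $job_id =="
    condor_tail -no-stdout -stderr "$job_id"
}

# Usage:
#   show_job_tails <job_id>
```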