###################################################################### ## ## condor_config ## ## This is the global configuration file for condor. ## ## This version is for submit-only machines. Please use another ## file if you have different needs by default. ## ## The file is divided into three main parts: settings you must ## customize, settings you may want to customize, and settings you ## should probably leave alone (unless you know what you're doing) ## ## Please read the INSTALL file (or the Install chapter in the ## Condor Administrator's Manual) for detailed explanations of the ## various settings in here and possible ways to configure your ## pool. ## ## Unless otherwise specified, settings that are commented out show ## the defaults that are used if you don't define a value. Settings ## that are defined here MUST BE DEFINED since they have no default ## value. ## ## Unless otherwise indicated, all settings which specify a time are ## defined in seconds. ## ###################################################################### ###################################################################### ###################################################################### ## Settings you must customize: ###################################################################### ###################################################################### ## What machine is your central manager? CONDOR_HOST = central-manager-hostname.your.domain ##-------------------------------------------------------------------- ## Pathnames: ##-------------------------------------------------------------------- ## Where have you installed the bin, sbin and lib condor directories? RELEASE_DIR = /usr/local/condor ## Where is the local condor directory for each host? LOCAL_DIR = $(TILDE) #LOCAL_DIR = $(RELEASE_DIR)/hosts/$(HOSTNAME) ## Where is the machine-specific local config file for each host? LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local #LOCAL_CONFIG_FILE = $(RELEASE_DIR)/etc/$(HOSTNAME).local ##-------------------------------------------------------------------- ## Mail parameters: ##-------------------------------------------------------------------- ## When something goes wrong with condor at your site, who should get ## the email? CONDOR_ADMIN = condor-admin@your.domain ## Full path to a mail delivery program that understands that "-s" ## means you want to specify a subject: MAIL = /usr/bin/mail ##-------------------------------------------------------------------- ## Network domain parameters: ##-------------------------------------------------------------------- ## Internet domain of machines sharing a common UID space. If your ## machines don't share a common UID space, use the second entry ## which specifies that each machine has its own UID space. UID_DOMAIN = your.domain #UID_DOMAIN = $(FULL_HOSTNAME) ## Internet domain of machines sharing a common file system. ## If your machines don't use a network file system, use the second ## entry which specifies that each machine has its own file system. FILESYSTEM_DOMAIN = your.domain #FILESYSTEM_DOMAIN = $(FULL_HOSTNAME) ###################################################################### ###################################################################### ## Settings you may want to customize: ## (it is generally safe to leave these untouched) ###################################################################### ###################################################################### ##-------------------------------------------------------------------- ## Flocking: Submitting jobs to more than one pool ##-------------------------------------------------------------------- ## If you would like to run your jobs in other pools, add the central ## manager hosts for those pools here. You must also add ## $(FLOCK_HOSTS) to ALLOW_NEGOTIATOR_SCHEDD below. #FLOCK_HOSTS = condor.friendly.domain, condor.cs.wisc.edu ##-------------------------------------------------------------------- ## Host/IP access levels ##-------------------------------------------------------------------- ## Please see the administrator's manual for details on these ## settings, what they're for, and how to use them. ## What machines have administrative rights for your pool? This ## defaults to your central manager. You should set it to the ## machine(s) where whoever is the condor administrator(s) works ## (assuming you trust all the users who log into that/those ## machine(s), since this is machine-wide access you're granting). ALLOW_ADMINISTRATOR = $(CONDOR_HOST) ## If there are no machines that should have administrative access ## to your pool (for example, there's no machine where only trusted ## users have accounts), you can uncomment this setting. ## Unfortunately, this will mean that administering your pool will ## be more difficult. #DENY_ADMINISTRATOR = * ## Read access. Machines listed as allow (and/or not listed as deny) ## can view the status of your pool, etc. ## NOTE: By default, without either of these entries specified, you ## are granting read access to the whole world. You may want to ## restrict that to hosts in your domain. If possible, please also ## grant read access to "*.cs.wisc.edu", so the Condor developers ## will be able to view the status of your pool and more easily help ## you install, configure or debug your Condor installation. #ALLOW_READ = *.your.domain, *.cs.wisc.edu #DENY_READ = *.bad.subnet, bad-machine.your.domain, 144.77.88.* ## Write access. Machines listed here can join your pool, submit ## jobs, etc. #ALLOW_WRITE = *.your.domain, your-friend's-machine.other.domain #DENY_WRITE = bad-machine.your.domain ## Negotiator access. Machines listed here are trusted central ## managers. ALLOW_NEGOTIATOR = $(CONDOR_HOST) #ALLOW_NEGOTIATOR_SCHEDD = $(CONDOR_HOST), $(FLOCK_HOSTS) ##-------------------------------------------------------------------- ## Network filesystem parameters: ##-------------------------------------------------------------------- ## Do you want to use NFS for file access instead of remote system ## calls? #USE_NFS = False ## Do you want to use AFS for file access instead of remote system ## calls? #USE_AFS = False ##-------------------------------------------------------------------- ## Checkpoint server: ##-------------------------------------------------------------------- ## Do you want to use a checkpoint server? #USE_CKPT_SERVER = False ## If so, what's the hostname? #CKPT_SERVER_HOST = checkpoint-server-hostname.your.domain ## Do you want the starter on the execute machine to choose the ## checkpoint server? If False, the CKPT_SERVER_HOST set on ## the submit machine is used. The default is false. #STARTER_CHOOSES_CKPT_SERVER = False ##-------------------------------------------------------------------- ## Miscellaneous: ##-------------------------------------------------------------------- ## Try to save this much swap space by not starting new shadows. ## Specified in megabytes. #RESERVED_SWAP = 5 ## What's the maximum number of jobs you want a single submit machine ## to spawn shadows for? #MAX_JOBS_RUNNING = 200 ## Condor needs to create a few lock files to synchronize access to ## various log files. Because of problems we've had with network ## filesystems and file locking over the years, we HIGHLY recommend ## that you put these lock files on a local partition on each ## machine. If you don't have your LOCAL_DIR on a local partition, ## be sure to change this entry. Whatever user (or group) condor is ## running as needs to have write access to this directory. If ## you're not running as root, this is whatever user you started up ## the condor_master as. If you are running as root, and there's a ## condor account, it's probably condor. Otherwise, it's whatever ## you've set in the CONDOR_IDS environment variable. See the Admin ## manual for details on this. LOCK = $(LOG) ## If you don't use a fully qualified name in your /etc/hosts file ## (or NIS, etc.) for either your official hostname or as an alias, ## Condor wouldn't normally be able to use fully qualified names in ## places that it'd like to. You can set this parameter to the ## domain you'd like appended to your hostname, if changing your host ## information isn't a good option. This parameter must be set in ## the global config file (not the LOCAL_CONFIG_FILE from above). #DEFAULT_DOMAIN_NAME = your.domain.name ## If you don't have DNS set up, Condor will normally fail in many ## places because it can't resolve hostnames to IP addresses and ## vice-versa. If you enable this option, Condor will use ## pseudo-hostnames constructed from a machine's IP address and the ## DEFAULT_DOMAIN_NAME. Both NO_DNS and DEFAULT_DOMAIN must set in ## your top-level config file for this mode of operation to work ## properly. #NO_DNS = True ## Condor can be told whether or not you want the Condor daemons to ## create a core file if something really bad happens. This just ## sets the resource limit for the size of a core file. By default, ## we don't do anything, and leave in place whatever limit was in ## effect when you started the Condor daemons. If this parameter is ## set and "True", we increase the limit to as large as it gets. If ## it's set to "False", we set the limit at 0 (which means that no ## core files are even created). Core files greatly help the Condor ## developers debug any problems you might be having. #CREATE_CORE_FILES = True ## This setting tells Condor whether to delegate or copy GSI X509 ## credentials when sending them over the wire between daemons. ## Delegation can take up to a second, which is very slow when ## submitting a large number of jobs. Copying exposes the credential ## to third parties if Condor isn't set to encrypt communications. ## By default, Condor will delegate rather than copy. DELEGATE_JOB_GSI_CREDENTIALS = True ##-------------------------------------------------------------------- ## Settings that control the daemon's debugging output: ##-------------------------------------------------------------------- ## ## The flags given in ALL_DEBUG are shared between all daemons. ## #ALL_DEBUG = #MAX_SCHEDD_LOG = $(MAX_DEFAULT_LOG) #SCHEDD_DEBUG = D_PID #MAX_SHADOW_LOG = $(MAX_DEFAULT_LOG) #SHADOW_DEBUG = #MAX_MASTER_LOG = $(MAX_DEFAULT_LOG) #MASTER_DEBUG = ## When the master starts up, should it truncate it's log file? #TRUNC_MASTER_LOG_ON_OPEN = False ## The daemons touch their log file periodically, even when they have ## nothing to write. When a daemon starts up, it prints the last time ## the log file was modified. This lets you estimate when a previous ## instance of a daemon stopped running. This paramete controls often ## the daemons touch the file (in seconds). TOUCH_LOG_INTERVAL = 60 ###################################################################### ###################################################################### ## Settings you should probably leave alone: ## (unless you know what you're doing) ###################################################################### ###################################################################### ###################################################################### ## Daemon-wide settings: ###################################################################### ## Pathnames LOG = $(LOCAL_DIR)/log SPOOL = $(LOCAL_DIR)/spool BIN = $(RELEASE_DIR)/bin LIB = $(RELEASE_DIR)/lib SBIN = $(RELEASE_DIR)/sbin LIBEXEC = $(RELEASE_DIR)/libexec HISTORY = $(SPOOL)/history ## Log files MASTER_LOG = $(LOG)/MasterLog SCHEDD_LOG = $(LOG)/SchedLog SHADOW_LOG = $(LOG)/ShadowLog ## Lock files SHADOW_LOCK = $(LOCK)/ShadowLock ## This setting primarily allows you to change the port that the ## collector is listening on. By default, the collector uses port ## 9618, but you can set the port with a ":port", such as: ## COLLECTOR_HOST = $(CONDOR_HOST):1234 COLLECTOR_HOST = $(CONDOR_HOST) ## The NEGOTIATOR_HOST parameter has been deprecated. The port where ## the negotiator is listening is now dynamically allocated and the IP ## and port are now obtained from the collector, just like all the ## other daemons. However, if your pool contains any machines that ## are running version 6.7.3 or earlier, you can uncomment this ## setting to go back to the old fixed-port (9614) for the negotiator. #NEGOTIATOR_HOST = $(CONDOR_HOST) ## What machine will be the condor_view server for this pool? #CONDOR_VIEW_HOST = full.hostname.domain ## How long are you willing to let daemons try their graceful ## shutdown methods before they do a hard shutdown? (30 minutes) #SHUTDOWN_GRACEFUL_TIMEOUT = 1800 ## How much disk space would you like reserved from Condor? In ## places where Condor is computing the free disk space on various ## partitions, it subtracts the amount it really finds by this ## many megabytes. (If undefined, defaults to 0). RESERVED_DISK = 5 ## If your machine is running AFS and the AFS cache lives on the same ## partition as the other Condor directories, and you want Condor to ## reserve the space that your AFS cache is configured to use, set ## this to true. #RESERVE_AFS_CACHE = False ###################################################################### ## Daemon-specific settings: ###################################################################### ##-------------------------------------------------------------------- ## condor_master ##-------------------------------------------------------------------- ## Daemons you want the master to keep running for you: DAEMON_LIST = MASTER, SCHEDD ## Where are the binaries for these daemons? MASTER = $(SBIN)/condor_master SCHEDD = $(SBIN)/condor_schedd ## When the master starts up, it can place it's address (IP and port) ## into a file. This way, tools running on the local machine don't ## need to query the central manager to find the master. This ## feature can be turned off by commenting out this setting. MASTER_ADDRESS_FILE = $(LOG)/.master_address ## What name do you want to use for this master? #MASTER_NAME = username@$(FULL_HOSTNAME) ## Where should the master find the condor_preen binary? If you don't ## want preen to run at all, just comment out this setting. PREEN = $(SBIN)/condor_preen ## How do you want preen to behave? The "-m" means you want email ## about files preen finds that it thinks it should remove. The "-r" ## means you want preen to actually remove these files. If you don't ## want either of those things to happen, just remove the appropriate ## one from this setting. PREEN_ARGS = -m -r ## How often should the master start up condor_preen? (once a day) #PREEN_INTERVAL = 86400 ## If a daemon dies an unnatural death, do you want email about it? #PUBLISH_OBITUARIES = True ## If you're getting obituaries, how many lines of the end of that ## daemon's log file do you want included in the obituary? #OBITUARY_LOG_LENGTH = 20 ## Should the master run? #START_MASTER = True ## Should the master start up the daemons you want it to? #START_DAEMONS = True ## How often do you want the master to send an update to the central ## manager? #MASTER_UPDATE_INTERVAL = 300 ## How often do you want the master to check the timestamps of the ## daemons it's running? If any daemons have been modified, the ## master restarts them. #MASTER_CHECK_NEW_EXEC_INTERVAL = 300 ## Once you notice new binaries, how long should you wait before you ## try to execute them? #MASTER_NEW_BINARY_DELAY = 120 ## What's the maximum amount of time you're willing to give the ## daemons to quickly shutdown before you just kill them outright? #SHUTDOWN_FAST_TIMEOUT = 120 ###### ## Exponential backoff settings: ###### ## When a daemon keeps crashing, we use "exponential backoff" so we ## wait longer and longer before restarting it. This is the base of ## the exponent used to determine how long to wait before starting ## the daemon again: #MASTER_BACKOFF_FACTOR = 2.0 ## What's the maximum amount of time you want the master to wait ## between attempts to start a given daemon? (With 2.0 as the ## MASTER_BACKOFF_FACTOR, you'd hit 1 hour in 12 restarts...) #MASTER_BACKOFF_CEILING = 3600 ## How long should a daemon run without crashing before we consider ## it "recovered". Once a daemon has recovered, we reset the number ## of restarts so the exponential backoff stuff goes back to normal. #MASTER_RECOVER_FACTOR = 300 ##-------------------------------------------------------------------- ## condor_schedd ##-------------------------------------------------------------------- ## Where are the various shadow binaries installed? SHADOW = $(SBIN)/condor_shadow ## When the schedd starts up, it can place it's address (IP and port) ## into a file. This way, tools running on the local machine don't ## need to query the central manager to find the schedd. This ## feature can be turned off by commenting out this setting. SCHEDD_ADDRESS_FILE = $(SPOOL)/.schedd_address ## What name do you want to use for this schedd? #SCHEDD_NAME = username@$(FULL_HOSTNAME) ## How often should the schedd send an update to the central manager? #SCHEDD_INTERVAL = 300 ## How long should the schedd wait between spawning each shadow? #JOB_START_DELAY = 2 ## How often should the schedd send a keep alive message to any ## startds it has claimed? (5 minutes) #ALIVE_INTERVAL = 300 ## This setting controls the maximum number of times that a ## condor_shadow processes can have a fatal error (exception) before ## the condor_schedd will simply relinquish the match associated with ## the dying shadow. #MAX_SHADOW_EXCEPTIONS = 5 ## Estimated virtual memory size of each condor_shadow process. ## Specified in kilobytes. SHADOW_SIZE_ESTIMATE = 1800 ## The condor_schedd can renice the condor_shadow processes on your ## submit machines. How how "nice" do you want the shadows? (1-19). ## The higher the number, the lower priority the shadows have. ## This feature can be disabled entirely by commenting it out. SHADOW_RENICE_INCREMENT = 10 ## By default, when the schedd fails to start an idle job, it will ## not try to start any other idle jobs in the same cluster during ## that negotiation cycle. This makes negotiation much more ## efficient for large job clusters. However, in some cases other ## jobs in the cluster can be started even though an earlier job ## can't. For example, the jobs' requirements may differ, because of ## different disk space, memory, or operating system requirements. ## Or, machines may be willing to run only some jobs in the cluster, ## because their requirements reference the jobs' virtual memory size ## or other attribute. Setting NEGOTIATE_ALL_JOBS_IN_CLUSTER to True ## will force the schedd to try to start all idle jobs in each ## negotiation cycle. This will make negotiation cycles last longer, ## but it will ensure that all jobs that can be started will be ## started. #NEGOTIATE_ALL_JOBS_IN_CLUSTER = False ###### ## Queue management settings: ###### ## How often should the schedd truncate it's job queue transaction ## log? (Specified in seconds, once a day is the default.) #QUEUE_CLEAN_INTERVAL = 86400 ## How often should the schedd commit "wall clock" run time for jobs ## to the queue, so run time statistics remain accurate when the ## schedd crashes? (Specified in seconds, once per hour is the ## default. Set to 0 to disable.) #WALL_CLOCK_CKPT_INTERVAL = 3600 ## What users do you want to grant super user access to this job ## queue? (These users will be able to remove other user's jobs). ## By default, this only includes root. QUEUE_SUPER_USERS = root, condor ##-------------------------------------------------------------------- ## condor_shadow ##-------------------------------------------------------------------- ## If the shadow is unable to read a checkpoint file from the ## checkpoint server, it keeps trying only if the job has accumulated ## more than MAX_DISCARDED_RUN_TIME seconds of CPU usage. Otherwise, ## the job is started from scratch. Defaults to 1 hour. This ## setting is only used if USE_CKPT_SERVER (from above) is True. #MAX_DISCARDED_RUN_TIME = 3600 ##-------------------------------------------------------------------- ## condor_submit ##-------------------------------------------------------------------- ## If you want condor_submit to automatically append an expression to ## the Requirements expression or Rank expression of jobs at your ## site, uncomment these entries. #APPEND_REQUIREMENTS = (expression to append job requirements) #APPEND_RANK = (expression to append job rank) ## If you want expressions only appended for either standard or ## vanilla universe jobs, you can uncomment these entries. If any of ## them are defined, they are used for the given universe, instead of ## the generic entries above. #APPEND_REQ_VANILLA = (expression to append to vanilla job requirements) #APPEND_REQ_STANDARD = (expression to append to standard job requirements) #APPEND_RANK_STANDARD = (expression to append to vanilla job rank) #APPEND_RANK_VANILLA = (expression to append to standard job rank) ## This can be used to define a default value for the rank expression ## if one is not specified in the submit file. #DEFAULT_RANK = (default rank expression for all jobs) ## If you want universe-specific defaults, you can use the following ## entries: #DEFAULT_RANK_VANILLA = (default rank expression for vanilla jobs) #DEFAULT_RANK_STANDARD = (default rank expression for standard jobs) ## If you want condor_submit to automatically append expressions to ## the job ClassAds it creates, you can uncomment and define the ## SUBMIT_ATTRS setting. It works just like the STARTD_ATTRS ## described above with respect to ClassAd vs. config file syntax, ## strings, etc. One common use would be to have the full hostname ## of the machine where a job was submitted placed in the job ## ClassAd. You would do this by uncommenting the following lines: #Machine = "$(FULL_HOSTNAME)" #SUBMIT_ATTRS = Machine ## Condor keeps a buffer of recently-used data for each file an ## application opens. This macro specifies the default maximum number ## of bytes to be buffered for each open file at the executing ## machine. #DEFAULT_IO_BUFFER_SIZE = 524288 ## Condor will attempt to consolidate small read and write operations ## into large blocks. This macro specifies the default block size ## Condor will use. #DEFAULT_IO_BUFFER_BLOCK_SIZE = 32768 ##-------------------------------------------------------------------- ## condor_preen ##-------------------------------------------------------------------- ## Who should condor_preen send email to? #PREEN_ADMIN = $(CONDOR_ADMIN) ## What files should condor_preen leave in the spool directory? VALID_SPOOL_FILES = job_queue.log, job_queue.log.tmp, history, \ Accountant.log, Accountantnew.log, \ local_univ_execute, .quillwritepassword, \ .pgpass, \ .schedd_address, .schedd_address.super, .schedd_classad ## What files should condor_preen remove from the log directory? INVALID_LOG_FILES = core ## ##-------------------------------------------------------------------- ## Condor-G settings ##-------------------------------------------------------------------- ## Where is the GridManager binary installed? GRIDMANAGER = $(SBIN)/condor_gridmanager GT2_GAHP = $(SBIN)/gahp_server GRID_MONITOR = $(SBIN)/grid_monitor ##-------------------------------------------------------------------- ## Settings that control the daemon's debugging output: ##-------------------------------------------------------------------- ## ## Note that the Gridmanager runs as the User, not a Condor daemon, so ## all users must have write permssion to the directory that the ## Gridmanager will use for it's logfile. Our suggestion is to create a ## directory called GridLogs in $(LOG) with UNIX permissions 1777 ## (just like /tmp ) ## Another option is to use /tmp as the location of the GridManager log. ## MAX_GRIDMANAGER_LOG = $(MAX_DEFAULT_LOG) GRIDMANAGER_DEBUG = GRIDMANAGER_LOG = $(LOG)/GridmanagerLog.$(USERNAME) GRIDMANAGER_LOCK = $(LOCK)/GridmanagerLock.$(USERNAME) ##-------------------------------------------------------------------- ## Various other settings that the Condor-G can use. ##-------------------------------------------------------------------- ## For grid-type gt2 jobs (pre-WS GRAM), limit the number of jobmanager ## processes the gridmanager will let run on the headnode. Letting too ## many jobmanagers run causes severe load on the headnode. GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE = 10 ## If we're talking to a Globus 2.0 resource, Condor-G will use the new ## version of the GRAM protocol. The first option is how often to check the ## proxy on the submit site of things. If the GridManager discovers a new ## proxy, it will restart itself and use the new proxy for all future ## jobs launched. In seconds, and defaults to 10 minutes #GRIDMANAGER_CHECKPROXY_INTERVAL = 600 ## The GridManager will shut things down 3 minutes before loosing Contact ## because of an expired proxy. ## In seconds, and defaults to 3 minutes #GRDIMANAGER_MINIMUM_PROXY_TIME = 180 ## Condor requires that each submitted job be designated to run under a ## particular "universe". ## ## If no universe is specificed in the submit file, Condor must pick one ## for the job to use. By default, it chooses the "vanilla" universe. ## The default can be overridden in the config file with the DEFAULT_UNIVERSE ## setting, which is a string to insert into a job submit description if the ## job does not try and define it's own universe ## #DEFAULT_UNIVERSE = vanilla # # The Cred_min_time_left is the first-pass at making sure that Condor-G # does not submit your job without it having enough time left for the # job to finish. For example, if you have a job that runs for 20 minutes, and # you might spend 40 minutes in the queue, it's a bad idea to submit with less # than an hour left before your proxy expires. # 2 hours seemed like a reasonable default. # CRED_MIN_TIME_LEFT = 120 ## ## The location of the wrapper for invoking ## Condor GAHP server ## CONDOR_GAHP = $(SBIN)/condor_c-gahp ## ## The Condor GAHP server has it's own log. Like the Gridmanager, the ## GAHP server is run as the User, not a Condor daemon, so all users must ## have write permssion to the directory used for the logfile. Our ## suggestion is to create a directory called GridLogs in $(LOG) with ## UNIX permissions 1777 (just like /tmp ) ## Another option is to use /tmp as the location of the CGAHP log. ## MAX_C_GAHP_LOG = $(MAX_DEFAULT_LOG) #C_GAHP_LOG = $(LOG)/GridLogs/CGAHPLog.$(USERNAME) C_GAHP_LOG = /tmp/CGAHPLog.$(USERNAME) C_GAHP_LOCK = /tmp/CGAHPLock.$(USERNAME) C_GAHP_WORKER_THREAD_LOG = /tmp/CGAHPWorkerLog.$(USERNAME) C_GAHP_WORKER_THREAD_LOCK = /tmp/CGAHPWorkerLock.$(USERNAME) ## ## Location of the PBS/LSF gahp and its associated binaries ## GLITE_LOCATION = $(LIB)/glite PBS_GAHP = $(GLITE_LOCATION)/bin/batch_gahp LSF_GAHP = $(GLITE_LOCATION)/bin/batch_gahp ## ## The location of the wrapper for invoking the Unicore GAHP server ## UNICORE_GAHP = $(SBIN)/unicore_gahp ## ## The location of the wrapper for invoking the NorduGrid GAHP server ## NORDUGRID_GAHP = $(SBIN)/nordugrid_gahp