Jobs and Slots, where jobs are run
HTCondor refers to any single computing task as a “job”, specified by several attributes such as the executable and the input and output files. Each computing node (worker) of the cluster has a fixed amount of resources (CPUs, GPUs, memory, ...) that can be split into “slots” in which jobs are run. As the job scheduler, HTCondor continuously reviews job attributes and slot availability and matches jobs to slots.
Simple jobs
Simple jobs are jobs using CPUs within a single machine. All other jobs are described in dedicated pages.
Detailed documentation is available at https://htcondor.readthedocs.io/en/latest/users-manual/submitting-a-job.html. The instructions below cover what is needed to submit simple jobs.
Use a shared filesystem:
/lapp_data for LAPP/LAPTH users
/univ_home for all other users
Specify an HTCondor Universe: A universe in HTCondor defines the execution environment of a job. HTCondor supports several universes, but the vanilla universe is intended for most programs and shell scripts. It should be used for all local jobs (except MPI jobs) submitted on the MUST cluster, as specified in the submit description file below.
Create a test.sh file with execution permissions (chmod 777 test.sh), for example:
#! /bin/sh
echo "HT-condor testprog01"
echo "I'm process id $$ on" `hostname`
echo "This is sent to standard error" 1>&2
date
echo "Running as binary $0" "$@"
echo "My name (argument 1) is $1"
echo "My sleep duration (argument 2) is $2"
sleep $2
echo "Sleep of $2 seconds finished. Exiting"
exit 0
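Before submitting, it can help to run a script locally with the same kind of arguments the job will receive (a name, then a sleep duration in seconds), keeping standard output and standard error in separate files as HTCondor does. The sketch below is self-contained and only illustrative: it writes a shortened stand-in script (the name dryrun.sh and the output file names are assumptions, not part of the MUST setup) into a temporary directory, so it does not touch your real test.sh.

```shell
#!/bin/sh
# Local dry run: a sketch, not part of the HTCondor workflow itself.
# dryrun.sh is a hypothetical, shortened stand-in for test.sh.
workdir=$(mktemp -d)
cat > "$workdir/dryrun.sh" <<'EOF'
#!/bin/sh
echo "My name (argument 1) is $1"
echo "This is sent to standard error" 1>&2
sleep "$2"
echo "Sleep of $2 seconds finished. Exiting"
exit 0
EOF

# Two arguments: a name, then a sleep duration (1 second to keep the check short).
# stdout and stderr go to separate files, like "output" and "error" in a submit file.
sh "$workdir/dryrun.sh" Example.0.0 1 > "$workdir/dryrun.out" 2> "$workdir/dryrun.err"
status=$?

echo "exit status: $status"
echo "stdout lines: $(wc -l < "$workdir/dryrun.out" | tr -d ' ')"
echo "stderr lines: $(wc -l < "$workdir/dryrun.err" | tr -d ' ')"
```

To check your own script the same way, replace the stand-in with ./test.sh and inspect the two files it produces.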
Create the associated test.submit description file:
# To specify the name of the script to run
executable=test.sh
# The following line is mandatory and all the local jobs (except MPI jobs) should be part of the vanilla universe.
# For MPI jobs, please refer to https://doc.must-datacentre.fr/batch/hpc/.
universe=vanilla
# To add parameters
arguments=Example.$(Cluster).$(Process) 100
output=results.output.$(Process)
error=results.error.$(Process)
log=results.log
# To receive a notification e-mail
#notification=never
notification=complete
notify_user=<your_e-mail>
# To use your environment in the job
getenv = True
# To give some environment variables to the job
#environment = one=1;two=2;three="quotes have no 'special' meaning"
# To launch several jobs at once, use "queue NUMBER"; a bare "queue" launches a single job
queue
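With "queue N", HTCondor submits N jobs under a single cluster number and numbers them with $(Process) from 0 to N-1, so each job gets its own output and error file while all of them share one log. The loop below only imitates that expansion in plain shell to show the resulting names; the cluster number 1234 and N=3 are made-up values, and no job is submitted.

```shell
#!/bin/sh
# Imitates how HTCondor expands $(Cluster) and $(Process) for "queue 3".
# cluster=1234 is a made-up value; HTCondor assigns the real one at submit time.
cluster=1234
njobs=3

process=0
names=""
while [ "$process" -lt "$njobs" ]; do
    echo "job $cluster.$process:" \
         "arguments=Example.$cluster.$process 100" \
         "output=results.output.$process" \
         "error=results.error.$process" \
         "log=results.log"
    # Collect the per-job output file names for a quick overview.
    names="$names results.output.$process"
    process=$((process + 1))
done

echo "output files:$names"
```

This is why the output and error names in the submit file contain $(Process): without it, jobs from the same submission would all write to the same files.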
Run the test:
condor_submit test.submit
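Once the job is submitted, the standard HTCondor client tools can be used to follow it. The sketch below is hedged in two ways: it only calls condor_q if the tool is found on the machine, and the commands left as comments need values from your own submission (<cluster_id> is a placeholder for the number printed by condor_submit).

```shell
#!/bin/sh
# Sketch: follow a submitted job. Runs HTCondor tools only if they are installed.
if command -v condor_q >/dev/null 2>&1; then
    condor_q                      # list your jobs that are still in the queue
    # condor_wait results.log     # block until the jobs recorded in results.log finish
    # condor_rm <cluster_id>      # remove a job; <cluster_id> is a placeholder
    checked="queue"
else
    echo "HTCondor client tools not found on this machine"
    checked="none"
fi
```

With the submit file above and a bare queue, the single job has process number 0, so its standard output ends up in results.output.0 and its standard error in results.error.0.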