Available queues
Queue Name | Allowed submitters | Restrictions |
---|---|---|
batch (default) | Math faculty/grads | Max 8GB memory (2GB default) |
faculty | Math faculty | Max 16GB memory (4GB default) |
short | Math faculty/grads | Max 8GB memory (2GB default) and 4 day walltime (2 day default) |
bigmem | Math faculty/grads | Max 50GB memory (8GB default) |
Specifying memory requirements
The pvmem parameter specifies the amount of virtual memory that a job will need. Most queues already set a reasonable default limit for you. If you won't need more than the default, you don't need to do anything extra. If the default is not appropriate for your job, specify a different amount, up to the queue's maximum.
bigmem is a queue for very large memory jobs. Only one such job can run at a time, so your job may wait a while before it is able to run. To use the bigmem queue, include the following in your .pbs file.
#PBS -q bigmem
#PBS -l nodes=1:ppn=1,pvmem=16gb
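Before submitting, it can help to check a memory request against the queue limits. The following is a minimal sketch, not a site-provided tool: the function names queue_max_gb and fits_queue are hypothetical, and the limits are simply copied from the Available queues table above.

```shell
# Maximum memory per queue in GB, copied from the "Available queues" table.
# (Hypothetical helper; the scheduler remains the authority on limits.)
queue_max_gb() {
    case "$1" in
        batch|short) echo 8 ;;
        faculty)     echo 16 ;;
        bigmem)      echo 50 ;;
        *)           return 1 ;;   # unknown queue name
    esac
}

# Succeeds when a request of $2 GB fits within the limit of queue $1.
fits_queue() {
    max=$(queue_max_gb "$1") || return 1
    [ "$2" -le "$max" ]
}

if fits_queue bigmem 16; then
    echo "16gb fits in bigmem"
fi
if ! fits_queue batch 16; then
    echo "16gb is too large for batch"
fi
```

With the table values above, a 16gb request passes for bigmem and is rejected for batch, which is why the bigmem example selects the queue explicitly with -q.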
PBS file parameters
The table below lists common Torque qsub parameters for submitting jobs to the cluster. To use one in your PBS file, put the prefix #PBS before it. Normally you do not need to use all of these.
Torque Parameter | Description |
---|---|
-V | Export all environment variables to the job. |
-N jobname | Set the name for your job. |
-o outfile | Set the output file name for your job. |
-e errorfile | Set the error file name for your job. |
-q queue_name | Specify the queue to submit your job to. batch is the default queue. |
-l nodes=#:ppn=# | Request the number of cpu cores you need for your job. The total number of cpu cores is (num nodes) x (processors per node). We have a maximum of 8 cpu cores available on any single node. |
-l walltime=HH:MM:SS | Maximum amount of time to allow job to run. Some queues have a maximum amount of time allowed. Jobs exceeding their maximum runtime are terminated so don't set this too short. |
-l pvmem=#mb | Maximum amount of virtual memory to allow job to use in megabytes. Use gb to specify in gigabytes. Most queues have maximum or minimum memory requirements set. |
-t [0-9,-] | Request an array job. Argument can be a comma and/or hyphenated list of digits. Ex. -t 0-5,11,14-16 will run an array job with PBS_ARRAYID values 0,1,2,3,4,5,11,14,15,16. |
-m bea | Send mail to user when job begins (b), ends (e), and aborts (a). Used together with -M. |
-M email_address | Specify your email address. |
-j oe | Merge standard output and error streams into one. |
-k oe | Force real time output and error streams. One can view these with tail -f <filename> |
Note: By default output stream is located in $HOME/<jobname>.o<jobnum> and error stream is located in $HOME/<jobname>.e<jobnum>.
See man pbs_resources for more details and examples.
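As an illustration of the -t parameter, here is a minimal sketch of an array job file. The job name and the input_N.dat / result_N.out naming scheme are hypothetical; only PBS_ARRAYID and the directives come from the table above. The fallback to 0 is an assumption added so the script can also be tried outside the scheduler.

```shell
#PBS -N Array_Job1
#PBS -S /bin/sh
#PBS -l nodes=1:ppn=1,walltime=01:00:00
#PBS -t 0-5,11,14-16

# Torque sets PBS_ARRAYID for each sub-job of the array.
# Fall back to 0 when the variable is unset (e.g. when run by hand).
TASK_ID=${PBS_ARRAYID:-0}

# Hypothetical naming scheme: one input and one output file per task.
INPUT="input_${TASK_ID}.dat"
OUTPUT="result_${TASK_ID}.out"

echo "Task ${TASK_ID}: would read ${INPUT} and write ${OUTPUT}"
```

Each of the ten sub-jobs runs the same script with a different PBS_ARRAYID, so the index is the usual way to select that task's input.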
Some example PBS files
Simple PBS file
#PBS -N Matlab_Job1
#PBS -S /bin/sh
#PBS -l nodes=1:ppn=1:matlab,walltime=10:00:00
#PBS -M mypid@math.vt.edu
#PBS -m bea
cd $HOME/matlab
matlab -nodisplay -r job1 > job1.out 2>&1
Note: Runs a Matlab job using job1.m with output from Matlab going to $HOME/matlab/job1.out. The job will run on a single processor core for up to 10 hours.
MPI Job PBS file
#PBS -N MPI_Job1
#PBS -S /bin/sh
#PBS -l nodes=3:ppn=1,walltime=01:00:00
#PBS -M mypid@math.vt.edu
#PBS -m bea
cd $HOME/mpi
export NP=`wc -l ${PBS_NODEFILE} | cut -d' ' -f1`
export MPDSNP=`uniq ${PBS_NODEFILE} |wc -l| cut -d' ' -f1`
cat ${PBS_NODEFILE} | uniq > /tmp/mpd_nodefile_${USER}_$$
export MPD_NODEFILE=/tmp/mpd_nodefile_${USER}_$$
mpdboot -v -n ${MPDSNP} -f ${MPD_NODEFILE}
mpdtrace -l
mpirun -np ${NP} myprogram
mpdallexit
Note: MPI needs to know which nodes your job will be running on. This is read from $PBS_NODEFILE, which the Torque resource manager creates dynamically when your job runs. The above PBS file should run the program called myprogram on 3 nodes for up to 1 hour, using a single processor core on each node. In some cases the scheduler may place tasks on processor cores of the same node (ignoring the nodes specification). The only workaround is to also specify a memory requirement large enough that a node can satisfy only a single task.
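To see what the NP and MPDSNP lines in the MPI job file compute, the following sketch applies the same counting to a stand-in node file. The host names are hypothetical, and the temporary file only imitates $PBS_NODEFILE, which lists one line per allocated processor core.

```shell
# Build a stand-in for $PBS_NODEFILE (hypothetical host names).
# node01 appears twice, as if two cores were allocated on it.
NODEFILE=$(mktemp)
printf '%s\n' node01 node01 node02 node03 > "$NODEFILE"

# Total number of MPI processes: one per line of the node file.
NP=$(wc -l < "$NODEFILE" | tr -d ' ')

# Number of distinct hosts: collapse adjacent duplicate lines, then count.
MPDSNP=$(uniq "$NODEFILE" | wc -l | tr -d ' ')

echo "NP=${NP} MPDSNP=${MPDSNP}"   # prints "NP=4 MPDSNP=3"

rm -f "$NODEFILE"
```

Note that uniq only merges adjacent duplicates, so this count assumes lines for the same host are grouped together in the node file; sort -u would be a safer choice if that is not guaranteed.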