- In this lab you will set up a PBS script that will allow you to run your program on the cluster via the queue.
- So open a new file called myFirstJob.pbs in an editor.
- The very first line of a script should always start with the following:
- This ensures that the correct shell is used and that your login scripts are invoked for predefined module definitions and aliases you may want when the program is run.
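- The original first line is not shown here. As a sketch only, a common choice that matches the description above is a bash login shell; the use of bash and the `-l` flag is an assumption, and your site may prescribe a different shell.

```shell
#!/bin/bash -l
# Assumption: bash is the shell in use on this cluster.
# The -l flag makes bash behave as a login shell, so it reads your login
# scripts (for example ~/.bash_profile) before running the job commands.
```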
- Now let’s define the scheduler account you will use.
- The qr command will list the account(s) you are a member of. Account names are case-sensitive.
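- The account line itself is not shown here. In PBS/Torque the account is normally given with the `-A` option, so the line would look roughly like the sketch below. The placeholder is hypothetical and must be replaced with an account name reported by qr.

```shell
# Assumption: the account is passed with the standard PBS -A option.
# Replace <your-account> with one of the names listed by qr (case-sensitive).
#PBS -A <your-account>
```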
- This next line defines which queue you’ll use. You have three choices.
- default – This is the main queue from which jobs are run
- largeq – If you need to queue up more than 600 jobs, you’ll want this one
- testq – There are 16 cores on the test nodes that are reserved for short test runs
#PBS -q default
- Now you’ll need a name for your job.
- This name will be used in the output files for the job.
#PBS -N myFirstJob
- You most likely would like to be notified when the job ends or aborts.
- So add the following two lines.
- The first one specifies your email address and the second specifies when you want to receive email.
- b = when the job begins
- e = when the job ends
- a = when the job aborts
- Mostly you’ll want to know when it ends or aborts, and may want to know when it begins if the cluster or your account usage is very busy.
#PBS -M <your-email-address>
#PBS -m ea
- Now let’s define how many cores you’ll need.
- Since this job only needs one, you specify it as follows:
#PBS -l nodes=1:ppn=1
- This means that you need a single node with at least 1 processor core available.
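- For comparison only (this lab does not need it), the same resource line for a hypothetical job that uses four cores on one node could be sketched as follows. The value 4 is an illustrative assumption, not a requirement of this lab.

```shell
# Hypothetical variant: one node with four processor cores on it.
# nodes = number of nodes, ppn = processor cores per node.
#PBS -l nodes=1:ppn=4
```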
- Next we’ll designate how much walltime we’ll need.
- The default is only 1 hour, so it is best to always set this explicitly.
- The value uses this syntax:
- We’ll set it for 1 hour and 35 minutes.
#PBS -l walltime=1:35:00
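- If you think of run times in minutes, a small helper can produce the walltime string for you. This is a minimal sketch that assumes the hours:minutes:seconds layout implied by the example above (1:35:00 for 1 hour and 35 minutes); the function name minutes_to_walltime is made up for illustration.

```shell
# Convert a whole number of minutes into an H:MM:SS walltime string.
# Assumption: walltime is written as hours:minutes:seconds.
minutes_to_walltime() {
    local total=$1
    printf '%d:%02d:00\n' $(( total / 60 )) $(( total % 60 ))
}

# 95 minutes is 1 hour and 35 minutes.
minutes_to_walltime 95
```

- The printed value can be pasted after `walltime=` in the directive.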
- The first thing you want the job to do is cd to the directory you submitted it from.
- By default the job will run from your HOME directory.
- Note: that is a capital letter O in the middle of the variable name.
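- The cd line is not shown here. PBS/Torque exports the submission directory in the variable PBS_O_WORKDIR, so the step can be sketched as below. The fallback to the current directory is an addition for illustration, so that the snippet also runs outside the scheduler.

```shell
# PBS_O_WORKDIR is set by the scheduler to the directory qsub was run from.
# Outside a job the variable is unset, so fall back to the current directory.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1
echo "Running in: $PWD"
```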
- Now you can add the commands that will run your process(es).
- This lab will have you run a program called shapley2 on some data.
- Add the following line.
- Now save your editing session and exit the editor.
- The next step is to simply submit the job to the queue.
- When you invoke this command it will respond with something like the following:
- The numeric portion of that output is your job number, which you can pass to the checkjob and showstart commands if the job doesn’t start immediately.
- The text portion is the name of the scheduler server.
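- The submit command and its response are not shown here. Assuming the standard PBS submit command qsub and a response of the form number.server, the job number can be separated out as in this sketch. The server name scheduler.example and the function name job_number are made up for illustration.

```shell
# Strip everything from the first dot onward, leaving the numeric job number.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# "scheduler.example" is a hypothetical server name, used only as sample input.
job_number "1418689.scheduler.example"

# On the cluster, the same idea applied to a real submission might be:
#   job_id=$(qsub myFirstJob.pbs)
#   checkjob "$(job_number "$job_id")"
```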
- Once the job starts to run, an output file called myFirstJob.o1418689 will show up in your submission directory.
- You can cat or tail this file to watch the progress of your job.
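- As a small sketch of that check, the snippet below prints the last lines of the output file if it exists yet. The file name reuses the example job number from this lab; yours will differ.

```shell
# Example output file name from this lab; substitute your own job number.
out=myFirstJob.o1418689

if [ -f "$out" ]; then
    # Show the most recent lines; use "tail -f" instead to keep following the file.
    tail -n 20 "$out"
else
    echo "$out not found yet"
fi
```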
- If the job outputs an error, a second file will show up called
- If there are no errors, the file will still show up but will be empty.