
Discovery Cluster

Where does the content of the terminal (standard out and standard error) go for a scheduled job (non-interactive)?

After the job starts, two files are created in the directory from which you submitted the job. Those files are named STDIN.o<job-id> and STDIN.e<job-id>

  • The .e file contains any errors the job generates (STDERR).
  • The .o file contains the output of the job along with prologue and epilogue information (STDOUT).
  • STDIN is the name of the job because qsub (mksub for DartFS) received the commands from STanDard INput.
  • The prologue shows the requested resources and the epilogue shows the received resources.
How much memory do my jobs get? How do I assign more memory for a job?

Each ‘core’ comes with 4 GB of RAM on cells E-K and 8 GB of RAM on cell M.

In your PBS script, you can specify the number of nodes and cores your job will require:

#PBS -l nodes=1:ppn=4

In this example, the job will be assigned one node and 4 cores, so 4 x 4 GB = 16 GB of RAM if it runs on cells E-K and 4 x 8 GB = 32 GB on cell M.

Note that even if your job needs no more than 1 core but requires more RAM, you must request the appropriate number of cores, which may remain unused.
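
For example, a single-threaded job that needs about 16 GB of RAM on cells E-K would still request 4 cores (a minimal sketch; the job name and program name are placeholders):

#PBS -N bigMemJob
# request 4 cores on 1 node so the job can use up to 4 x 4 GB = 16 GB of RAM on cells E-K
#PBS -l nodes=1:ppn=4
cd $PBS_O_WORKDIR
# the program itself only uses one core; the extra cores reserve the memory
./single_threaded_program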

What are some of the available scheduler commands?
  • qsub (mksub for DartFS) pbs_script_filename — submit job
  • myjobs [-rn] — view job(s) status
  • qshow [-r] — view queue status
  • pbsmon — view nodes & status
  • checkjob [-v] jobID — view job(s) status
  • qr — view your resources
  • qdel jobID  — remove job
  • notify — notify near run end
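
A typical submit-and-monitor session might look like this (a sketch; myscript.pbs and the job ID 1234567 are placeholders):

$ qsub myscript.pbs       # submit the job; qsub prints the job ID
$ myjobs                  # check the status of your jobs
$ checkjob -v 1234567     # detailed status for a single job
$ qdel 1234567            # remove the job if it is no longer needed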
What is an example of a PBS script?
#!/bin/bash -l
# declare a name for this job
#PBS -N myFirstJob
# request the queue (enter one of the available queue names; if omitted, the default queue is used)
# if you are submitting more than 600 jobs, use the largeq
#PBS -q default
# request 1 core on 1 node
# ensure you reserve enough cores for the projected memory usage
# figuring 4GB/core
#PBS -l nodes=1:ppn=1
# request 4 hours and 30 minutes of wall time
#PBS -l walltime=04:30:00
# mail is sent to you when the job begins and when it exits or aborts
# you can use all or some or none.  If you don't want email leave this
# and the following (#PBS -M) out of the script.
#PBS -m bea
# specify your email address
#PBS -M John.Smith@dartmouth.edu
# By default, PBS scripts execute in your home directory, not the
# directory from which they were submitted. The following line
# places you in the directory from which the job was submitted.
cd $PBS_O_WORKDIR
# run the program
./program_name arg1 arg2 ...
How do I test my script/code? How do I estimate the walltime? How do I run interactive jobs?

Testing is important: an accurate walltime estimate determines your place in the queue, and a short test run helps you catch bugs before submitting.

Use one of the three test nodes: x01, x02, or x03.

Use “tnodeload” to choose the least busy test node.

From Discovery:

$ ssh x01

Time your job:

$ time yourExecutableScript -p param1 -q param2

Do not forget to ‘exit’ your test node SSH session when you are done.
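
Putting the steps together, a test session might look like this (a sketch; the script name and arguments are placeholders):

$ tnodeload                                         # show the load on the test nodes
$ ssh x02                                           # connect to the least busy test node
$ time ./yourExecutableScript -p param1 -q param2   # measure the run time
$ exit                                              # leave the test node when done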

How do I run interactive jobs?

On Discovery, we recommend running interactive jobs only on the test nodes: x01, x02, and x03.

DO NOT RUN INTERACTIVE JOBS ON THE DISCOVERY HEADNODE!!!

You can SSH into any node to check its load, but do not run jobs on the nodes directly, as that will interfere with the scheduler.

For interactive jobs, please use Andes and Polaris, which are shared resources meant to be used interactively.

What do the ‘qsub’ options mean? Where can I find ‘qsub’ documentation?

On Discovery:

man qsub

or through this link: qsub online documentation

Python

What Version of Python Should I Use on the HPC Systems?

There are multiple versions of Python available on the HPC systems. The system version, installed in /usr/bin/python, is an older version of Python; it does not have many packages installed and it does not use any high-performance libraries. If you are doing any significant amount of data processing or scientific computing, we recommend that you use the Anaconda distribution of Python.

You can select the version of Python to use by loading one of the python modules. Use the command module avail python to see a list of the python modules available on the system, and then use the module load command to load a particular module. Here is an example of how to list the modules and then load one.

$ module avail python

------------------------ /opt/Modules/3.2.9/modulefiles ------------------------
python/2.7-Anaconda(default) python/2.7.8
python/2.7-CPU               python/2.7.9
python/2.7-GPU               python/3.4.3
python/2.7-QIIME2            python/3.5-Anaconda
python/2.7.10                python/3.6-GPU
python/2.7.11                python/3.6-Miniconda
python/2.7.6

$ module load python

The Python 2.7 Anaconda module is the default, so you do not need to specify a version number to load it. To select any other version, include the version number, for example: module load python/3.5-Anaconda
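
For example, to load a specific version and confirm which interpreter is active (a short sketch using the python/3.5-Anaconda module listed above):

$ module load python/3.5-Anaconda
$ which python         # shows the path of the python that will run
$ python --version     # confirms the interpreter version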

Can RC install some python package for me?

Users sometimes ask us to install a Python package in one of the versions of Python installed on our systems. If the package is one that would be useful to a number of our users, is actively maintained, and is easy to install, we will install it in one of our Python versions.

If the package is useful to only a small number of users, we suggest that you install it in your own user space. A good way to do this is to create a virtual environment for it using the Anaconda Python distribution. See this Python FAQ for information on how to install a package in an Anaconda virtual environment.

If you are not using an Anaconda Python distribution, you can use the pip command to install packages into your own directory. Here is the command to install a package locally with pip:

pip install --user package_name
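
For example, a minimal sketch assuming a non-Anaconda module such as python/3.4.3 from the list above and an illustrative package name:

$ module load python/3.4.3
$ pip install --user requests        # installs into your user site directory (typically ~/.local)
$ python -m site --user-site         # shows where user-installed packages are placed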

Contact us if you need help.

How to create a Conda virtual environment for python packages

The Anaconda Python distribution allows you to create virtual environments, which are a good way to manage Python packages for different projects. First, load one of the Anaconda python modules, and then use the conda tool to create and activate a virtual environment. Here is an example of how to create an environment, activate it, and install packages.

module load python
conda create -n myenv
source activate myenv
conda install package1
conda install package2

To leave an activated environment, use:

source deactivate
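
You can also pin a Python version and install packages when the environment is created (a sketch; the environment name and package choices are illustrative):

module load python
conda create -n myproject python=3.6 numpy scipy
source activate myproject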

Here are some useful conda commands:

List the packages in the current conda environment:

conda list

Search conda channels for a particular package:

conda search package_name

List all of your conda environments:

conda env list

Here is the Conda Cheat Sheet

If the package that you need is not in any Anaconda channels, you can use the pip command to install the package into your virtual environment. With the environment activated, run pip without the --user flag (the --user flag would install the package to your home directory rather than into the environment):

pip install package_name
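
For example (a sketch; the environment and package names are illustrative):

source activate myenv
pip install some_package      # installed into the active conda environment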

Here is the Anaconda web page on managing packages with conda.

