Using the GPU Nodes

Discovery has 12 GPU nodes available to users: g01 through g12. Each node has two NVIDIA K80 GPU boards.

Here is a sample submit script that submits a job to run on one of the GPU nodes.

There are several applications and libraries that allow you to run your programs on a GPU:

- CUDA: NVIDIA’s parallel programming environment that lets you add code to your C/C++ programs so that they can use an NVIDIA GPU.
  - You need to load the CUDA module before you can use it.
  - The CUDA SDK with CUDA examples is installed in /opt/cuda/active/samples.
  - The cuBLAS and cuFFT libraries are installed with CUDA and let your BLAS and FFT codes run on the GPU.
- CUDA Fortran: part of the Portland Group Fortran compilers installed on Discovery.
  - The module for the Portland Group compiler is loaded by default in your Discovery user environment, but you will need to load the CUDA module in order to use CUDA Fortran.
- OpenACC: compiler directives supported by the Portland Group C/C++ and Fortran compilers. They allow you to add directives (similar to OpenMP) to your code and run your program in parallel on a GPU.
- TotalView Debugger: a GUI-based debugger that can be used to debug your CUDA and OpenACC programs on Discovery.
  - The TotalView debugger module is loaded into your Discovery environment and can be accessed by typing ‘TotalView’ in your terminal window.

PyCUDA is a Python module that lets your Python program run on the GPU. Load the Python module to access it.
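
As a rough illustration of what a PyCUDA program looks like, the sketch below multiplies every element of an array by a constant on the GPU. It is not a script shipped with Discovery: the names `KERNEL_SOURCE`, `grid_size`, and `scale_on_gpu` are hypothetical and chosen only for this example, and it assumes that NumPy and PyCUDA are both available once the Python module is loaded. The GPU work happens inside `scale_on_gpu`, so that function only succeeds inside a job running on one of the GPU nodes.

```python
"""Scale an array on the GPU with PyCUDA (illustrative sketch only)."""

# CUDA C kernel source: each GPU thread multiplies one element by `factor`.
KERNEL_SOURCE = """
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}
"""

# Threads per block; 256 is a common choice, not a Discovery requirement.
THREADS_PER_BLOCK = 256


def grid_size(n, threads_per_block=THREADS_PER_BLOCK):
    """Return the number of thread blocks needed to cover n elements."""
    # Ceiling division, with a minimum of one block.
    return max(1, (n + threads_per_block - 1) // threads_per_block)


def scale_on_gpu(values, factor):
    """Return a list with every element of `values` multiplied by `factor`."""
    # Imported inside the function so this file can still be imported on a
    # machine without a GPU (pycuda.autoinit needs a GPU to create a context).
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    # Copy the input into a contiguous single-precision array.
    data = np.array(values, dtype=np.float32)
    if data.size == 0:
        return []

    # Compile the kernel and look up the function by name.
    kernel = SourceModule(KERNEL_SOURCE).get_function("scale")

    # cuda.InOut copies `data` to the GPU before the call and back afterwards.
    kernel(
        cuda.InOut(data),
        np.float32(factor),
        np.int32(data.size),
        block=(THREADS_PER_BLOCK, 1, 1),
        grid=(grid_size(data.size), 1),
    )
    return data.tolist()
```

Inside a job on a GPU node, calling `scale_on_gpu([1.0, 2.0, 3.0], 2.0)` is expected to return the input values doubled. The bounds check `if (i < n)` in the kernel matters because the last block usually contains more threads than there are remaining elements.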