NVIDIA CUDA on Proteus
Hardware
- Eight nodes (gpu01 -- gpu08), each with:
- dual NVIDIA Tesla K20Xm GPUs - Kepler microarchitecture, GK110 die[1][2] -- 2688 CUDA cores and 6 GB (6144 MB) RAM per GPU
Installed Version
NVIDIA CUDA 9.0 is installed on all GPU nodes.
Other NVIDIA Libraries
- NCCL - located in /usr/local/cuda/nccl
Compile Options
Architecture
Compile for the native compute capability 3.5 target architecture (sm_35).[3]
-gencode arch=compute_35,code=sm_35
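For instance, a direct nvcc invocation using this flag might look like the following (the source and output file names are illustrative):

```shell
nvcc -gencode arch=compute_35,code=sm_35 -O2 -o myapp myapp.cu
```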
If the software expects a CUDA_ARCH environment variable, use:
CUDA_ARCH=35
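As a minimal sketch, assuming the software's build scripts read this variable (the `make` step is illustrative and depends on the particular package):

```shell
# Export the compute capability the build scripts look for (sm_35 on the K20Xm)
export CUDA_ARCH=35
echo "$CUDA_ARCH"
# make    # illustrative: the software's own build step would follow here
```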
C Compiler
CUDA will not work with the Intel compiler. Please use GCC: the gcc/4.8.1 modulefile, which is loaded by default, will work.
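If nvcc picks up the wrong host compiler, it can be pointed at GCC explicitly with nvcc's `-ccbin` option (file names are illustrative):

```shell
module load gcc/4.8.1
nvcc -ccbin gcc -gencode arch=compute_35,code=sm_35 -o myapp myapp.cu
```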
Hybrid MPI with CUDA
See https://www.ccv.brown.edu/doc/mixing-mpi-and-cuda.html for an example.
NOTES
- This may or may not improve the performance of your code: benchmark your own code to find out.
- Compiling a CUDA-enabled code with mpicc alone is not enough to generate an executable that uses both CUDA and MPI. The code has to be written specifically to integrate CUDA with MPI.
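One common pattern is to bind each MPI rank to a GPU and compile the CUDA and MPI parts separately, linking them at the end. A minimal sketch, assuming one rank per GPU on a node (file and function names here are illustrative, not from the linked example):

```c
/* main.c -- compile with mpicc, link against the nvcc-compiled object:
 *   nvcc -gencode arch=compute_35,code=sm_35 -c kernel.cu
 *   mpicc main.c kernel.o -lcudart -o myapp
 */
#include <mpi.h>
#include <cuda_runtime.h>

void launch_kernel(int rank);   /* illustrative: defined in kernel.cu */

int main(int argc, char **argv)
{
    int rank, ndev;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    cudaGetDeviceCount(&ndev);
    cudaSetDevice(rank % ndev);  /* bind this rank to one of the node's two K20Xm GPUs */
    launch_kernel(rank);
    MPI_Finalize();
    return 0;
}
```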
Running CUDA Jobs
See: GPU Jobs on Proteus
PyCUDA
PyCUDA[4][5] is a Python interface to CUDA.
To Use
PyCUDA is installed on the GPU nodes as part of the python37 conda environment.
- First, load the python/anaconda3 module:
module load python/anaconda3
- Then, activate the python37 environment:
conda activate python37
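A quick check that the environment works, using PyCUDA's gpuarray interface (this must be run on a GPU node; the array contents are arbitrary):

```python
import numpy as np
import pycuda.autoinit           # initializes the CUDA driver and creates a context
import pycuda.gpuarray as gpuarray

a = np.arange(8, dtype=np.float32)
a_gpu = gpuarray.to_gpu(a)       # copy the host array to the GPU
result = (2 * a_gpu).get()       # double it on the GPU, then copy back
print(result)
```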
References
[1] NVIDIA Tesla K-Series Overview (PDF)
[2] TechPowerUp Hardware Database: NVIDIA K20Xm
[3] NVIDIA CUDA Toolkit 6.0 Documentation - Kepler Tuning Guide
[5] PyCUDA webpage