Installing TensorFlow 2.11.0 using pip and venv
We follow the official instructions for installation via pip, with two exceptions: we use a pre-installed Python provided by the modulefile python/gcc/3.10, and we use Python virtual environments (venv)[1][2] instead of miniconda (or Anaconda).
N.B. The TensorFlow package from pip supports both CPU-only and GPU execution. There is no need to install both the "tensorflow" and "tensorflow-gpu" packages: they are identical.
Requirements
Listed requirements (check the support matrix for each):
- CUDA driver >= 450.80.02
- CUDA Toolkit 11.2
- cuDNN 8.1.0
- Optional: TensorRT to improve latency and throughput for inference
N.B. Using the above listed requirements will result in warning messages about not being able to find certain library (shared object) files:
2022-12-05 07:15:48.158455: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvrtc.so.11.1: cannot open shared object file: No such file or directory; ...
Actual requirements, because the TF pip package from PyPI links against CUDA 11.1:
- CUDA driver >= 450.80.02
- CUDA Toolkit 11.1
- cuDNN 8.x
- TensorRT 7.2
UPDATE 2023-01-10: The TF PyPI packages appear to have been fixed.
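To see which of these libraries the dynamic loader can actually open after the modulefiles are loaded, a short Python check can help. The following is a minimal sketch, not part of the official instructions: the function name check_shared_libs is hypothetical, and the library names other than libnvinfer.so.7 (taken from the warning above) are assumed sonames for CUDA 11.x and cuDNN 8.x. Substitute whatever names appear in your own warnings.

```python
import ctypes


def check_shared_libs(names):
    """Return {name: bool} telling whether the dynamic loader can open each library."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library (or a dependency) is missing
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Assumed library names; only libnvinfer.so.7 comes from the warning shown above.
    wanted = ["libcudart.so.11.0", "libcudnn.so.8", "libnvinfer.so.7"]
    for lib, ok in check_shared_libs(wanted).items():
        print(f"{lib}: {'found' if ok else 'NOT found'}")
```

Run it after the "module load" commands: a library reported as NOT found usually means a modulefile is missing or provides a different version.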
Interactive session on GPU node
Run an interactive shell on a GPU node:
[juser@picotte001 ~]$ srun -p gpu --gpus-per-node=1 --mem-per-gpu=16G --cpus-per-gpu=12 --time=2:00:00 --pty /bin/bash
[juser@gpu005 ~]$
Check the number of GPUs assigned: you should see only one, with ID number 0:
[juser@gpu005 ~]$ nvidia-smi
Mon Feb 27 17:15:18 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.39.01    Driver Version: 510.39.01    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:18:00.0 Off |                    0 |
| N/A   37C    P0    41W / 300W |      0MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
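The same check can be scripted rather than read off the table. This is a minimal sketch that assumes "nvidia-smi -L" prints one line per visible GPU beginning with "GPU <id>:"; the helper names count_gpus and visible_gpu_count are hypothetical.

```python
import subprocess


def count_gpus(listing: str) -> int:
    """Count lines of `nvidia-smi -L` output that describe a GPU."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def visible_gpu_count() -> int:
    """Run `nvidia-smi -L` and return the number of GPUs it lists."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return count_gpus(result.stdout)


if __name__ == "__main__":
    try:
        print("GPUs visible:", visible_gpu_count())
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        # nvidia-smi is absent or failed, e.g. when run on a node without GPUs
        print("Could not query GPUs:", err)
```

With the srun options shown above, the reported count should be 1.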
Load requirements
Set up environment and load appropriate modulefiles:
[juser@gpu005 ~]$ module use /ifs/opt_cuda/modulefiles
[juser@gpu005 ~]$ module load python/gcc/3.10
[juser@gpu005 ~]$ module load cuda11.2/toolkit cuda11.2/blas cuda11.2/fft tensorrt-cuda11.2/7.2.3.4 cudnn8.7-cuda11.2 cutensor-cuda11.2
Set up Python virtual environment (venv)
[juser@gpu005 ~]$ cd /ifs/groups/myrsrchGrp
[juser@gpu005 myrsrchGrp]$ mkdir venvs
[juser@gpu005 myrsrchGrp]$ python3 -m venv ./venvs/py310-tf211
[juser@gpu005 myrsrchGrp]$ source ./venvs/py310-tf211/bin/activate
(py310-tf211) [juser@gpu005 myrsrchGrp]$
Note the change of prompt: the venv name "(py310-tf211)" is added.
Next, check that the venv is active by looking at the location of the python3 executable:
(py310-tf211) [juser@gpu005 myrsrchGrp]$ which python3
/ifs/groups/myrsrchGrp/venvs/py310-tf211/bin/python3
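The check can also be made from inside Python, which is handy at the top of a script. A minimal sketch: inside a venv, sys.prefix points at the venv directory while sys.base_prefix points at the Python installation the venv was created from. The function name in_venv is hypothetical.

```python
import sys


def in_venv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_venv())
    print("interpreter:", sys.executable)
```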
Update pip and setuptools (because there is a critical setuptools security fix):
(py310-tf211) [juser@gpu005 myrsrchGrp]$ python3 -m pip install -U pip setuptools
Requirement already satisfied: pip in ./venvs/py310-tf211/lib/python3.10/site-packages (22.2.2)
Collecting pip
Using cached pip-22.3.1-py3-none-any.whl (2.1 MB)
Requirement already satisfied: setuptools in ./venvs/py310-tf211/lib/python3.10/site-packages (63.2.0)
Collecting setuptools
Using cached setuptools-65.6.3-py3-none-any.whl (1.2 MB)
Installing collected packages: setuptools, pip
Attempting uninstall: setuptools
Found existing installation: setuptools 63.2.0
Uninstalling setuptools-63.2.0:
Successfully uninstalled setuptools-63.2.0
Attempting uninstall: pip
Found existing installation: pip 22.2.2
Uninstalling pip-22.2.2:
Successfully uninstalled pip-22.2.2
Successfully installed pip-22.3.1 setuptools-65.6.3
Install TensorFlow
Install TensorFlow 2.11.0 using pip:
(py310-tf211) [juser@gpu005 myrsrchGrp]$ python3 -m pip install tensorflow==2.11.0
Collecting tensorflow==2.11.0
Downloading tensorflow-2.11.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (588.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 588.3/588.3 MB 3.7 MB/s eta 0:00:00
...
Installing collected packages: tensorboard-plugin-wit, pyasn1, libclang, flatbuffers, wrapt, wheel, urllib3, typing-extensions, termcolor, tensorflow-io-gcs-filesystem, tensorflow-estimator, tensorboard-data-server, six, rsa, pyparsing, pyasn1-modules, protobuf, oauthlib, numpy, MarkupSafe, markdown, keras, idna, grpcio, gast, charset-normalizer, certifi, cachetools, absl-py, werkzeug, requests, packaging, opt-einsum, h5py, google-pasta, google-auth, astunparse, requests-oauthlib, google-auth-oauthlib, tensorboard, tensorflow
Successfully installed MarkupSafe-2.1.1 absl-py-1.3.0 astunparse-1.6.3 cachetools-5.2.0 certifi-2022.9.24 charset-normalizer-2.1.1 flatbuffers-22.11.23 gast-0.4.0 google-auth-2.15.0 google-auth-oauthlib-0.4.6 google-pasta-0.2.0 grpcio-1.51.1 h5py-3.7.0 idna-3.4 keras-2.11.0 libclang-14.0.6 markdown-3.4.1 numpy-1.23.5 oauthlib-3.2.2 opt-einsum-3.3.0 packaging-21.3 protobuf-3.19.6 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyparsing-3.0.9 requests-2.28.1 requests-oauthlib-1.3.1 rsa-4.9 six-1.16.0 tensorboard-2.11.0 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorflow-2.11.0 tensorflow-estimator-2.11.0 tensorflow-io-gcs-filesystem-0.28.0 termcolor-2.1.1 typing-extensions-4.4.0 urllib3-1.26.13 werkzeug-2.2.2 wheel-0.38.4 wrapt-1.14.1
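To confirm which versions ended up in the venv without importing TensorFlow (which is slow and prints log messages), the package metadata can be queried directly. A minimal sketch using the standard library; the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for pkg in ("tensorflow", "keras", "numpy"):
        print(pkg, installed_version(pkg))
```

Run it with the venv active; a result of None for tensorflow would suggest the script is running under a different interpreter.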
Test TensorFlow
Run a simple one-line test to create a random 1000x1000 tensor and perform a reduce_sum():
(py310-tf211) [juser@gpu005 myrsrchGrp]$ python3 -c "import tensorflow as tf; print(tf.__version__); print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2023-01-10 18:31:18.187484: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-01-10 18:31:18.290904: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2.11.0
2023-01-10 18:31:24.326718: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-01-10 18:31:24.840312: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30972 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:3b:00.0, compute capability: 7.0
tf.Tensor(354.97415, shape=(), dtype=float32)
The result tf.Tensor(354.97415, shape=(), dtype=float32) will be different for you since it is computed from a random tensor.
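The log line "Created device ... GPU:0" above shows that the GPU was used. To check this explicitly rather than by reading the log, one can ask TensorFlow for the GPUs it sees. A minimal sketch: the helper name list_gpus is hypothetical, and it returns None when TensorFlow cannot be imported (for example, when the venv is not active).

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or None if TF is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif not gpus:
        print("TensorFlow is installed but sees no GPU")
    else:
        print("GPUs seen by TensorFlow:", gpus)
```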
For interactive use, remember to deactivate the venv once you are done with TensorFlow:
(py310-tf211) [juser@gpu005 myrsrchGrp]$ deactivate
[juser@gpu005 myrsrchGrp]$ which python3
/ifs/opt/python/gcc/3.10.2/bin/python3
Note that the prompt loses the "(py310-tf211)" tag.
Job scripts
Job scripts will need to set up the same environment before running the Python script.
Example Python script
Create and save this file as "test_tf.py":
#!/usr/bin/env python3
import tensorflow as tf
print(tf.__version__)
print(tf.reduce_sum(tf.random.normal([1000, 1000])))
Example job script
Create a job script named "tf_job.sh", in the same directory as the above test_tf.py file, to run the TensorFlow computation:
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gpus-per-node=1
#SBATCH --cpus-per-gpu=12
#SBATCH --mem-per-gpu=40G
#SBATCH --time=0:15:00
module use /ifs/opt_cuda/modulefiles
module load python/gcc/3.10
module load cuda11.2/toolkit cuda11.2/blas cuda11.2/fft tensorrt-cuda11.2/7.2.3.4 cudnn8.7-cuda11.2 cutensor-cuda11.2
# activate TF venv
source /ifs/groups/myrsrchGrp/venvs/py310-tf211/bin/activate
python3 test_tf.py
Submit the job:
[juser@picotte001 ~]$ sbatch tf_job.sh
The output will be in a file named "slurm-NNNNNNN.out" where "NNNNNNN" is the job ID. Its contents should be something like:
2022-12-05 07:51:42.721183: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-05 07:51:42.841257: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2.11.0
2022-12-05 07:51:44.995935: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-05 07:51:45.495916: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30972 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:86:00.0, compute capability: 7.0
tf.Tensor(213.94812, shape=(), dtype=float32)
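When submitting from a wrapper script, the output file name can be derived from what sbatch prints. A minimal sketch, assuming sbatch reports "Submitted batch job <id>" on success and that the job script does not override the default output file name; the function name slurm_output_file and the job ID in the example are hypothetical.

```python
import re


def slurm_output_file(sbatch_stdout: str) -> str:
    """Derive the default Slurm output file name from the text printed by sbatch."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError("no job ID found in sbatch output")
    return f"slurm-{match.group(1)}.out"


if __name__ == "__main__":
    # Hypothetical job ID, for illustration only
    print(slurm_output_file("Submitted batch job 1234567"))
```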
CAUTIONS
XLA_FLAGS environment variable
For some TensorFlow applications, an environment variable may need to be set:
export XLA_FLAGS="--xla_gpu_cuda_data_dir=/cm/shared/apps/cuda11.2"
Do it in your job script, before the line that runs your TF code.
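If editing the job script is inconvenient, the variable can instead be set at the top of the Python script, provided this happens before TensorFlow is imported. A minimal sketch under that assumption: the helper name add_xla_flag is hypothetical, and the path is the one from the export line above. It appends to any existing XLA_FLAGS value instead of overwriting it.

```python
import os


def add_xla_flag(flag: str) -> str:
    """Append a flag to XLA_FLAGS unless it is already present; return the new value."""
    current = os.environ.get("XLA_FLAGS", "")
    if flag not in current.split():
        os.environ["XLA_FLAGS"] = f"{current} {flag}".strip()
    return os.environ["XLA_FLAGS"]


# Must run before TensorFlow is imported for the flag to take effect.
add_xla_flag("--xla_gpu_cuda_data_dir=/cm/shared/apps/cuda11.2")

# import tensorflow as tf  # import TensorFlow only after the variable is set
```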
Missing libdevice.10.bc
Even when the environment variables that define the path to the CUDA Toolkit installation are set correctly, TF can have trouble finding the library file libdevice.10.bc.
The workaround is to copy it to the same directory as your Python TF script:
[juser@gpu001 ~]$ cp $CUDA_DIR/nvvm/libdevice/libdevice.10.bc .
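The copy can also be done from Python, for example by a setup script that prepares a run directory. A minimal sketch, assuming the CUDA_DIR environment variable is set by the CUDA modulefile as in the cp command above; the function name copy_libdevice is hypothetical.

```python
import os
import shutil
from pathlib import Path


def copy_libdevice(cuda_dir, dest_dir="."):
    """Copy libdevice.10.bc from a CUDA installation into dest_dir.

    Returns the path of the copy, or None if the source file does not exist.
    """
    src = Path(cuda_dir) / "nvvm" / "libdevice" / "libdevice.10.bc"
    if not src.is_file():
        return None
    return Path(shutil.copy2(src, dest_dir))


if __name__ == "__main__":
    cuda_dir = os.environ.get("CUDA_DIR")
    if cuda_dir is None:
        print("CUDA_DIR is not set; load the CUDA modulefile first")
    else:
        print("copied to:", copy_libdevice(cuda_dir))
```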
Examples
- Slurm - Job Script Example 08 TensorFlow using virtualenv
- Slurm - Job Script Example 08a TensorFlow multi-GPU using virtualenv