Installing TensorFlow 2.11.0 using pip and venv
We follow the official instructions for installation via pip, except that we
use a pre-installed Python via the modulefile python/gcc/3.10, and we use
Python virtual environments (venv)[1][2] instead of miniconda (or Anaconda).
N.B. The TensorFlow pip package supports both CPU-only and GPU operation.
There is no need to install both the "tensorflow" and "tensorflow-gpu"
packages: they are identical.
Requirements♯
Officially listed requirements (check the support matrix for each):
- CUDA driver >= 450.80.02
- CUDA Toolkit 11.2
- CuDNN 8.1.0
- Optional: TensorRT to improve latency and throughput for inference
N.B. With the requirements listed above, TensorFlow prints warning messages about not being able to find certain library (shared object) files:
2022-12-05 07:15:48.158455: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvrtc.so.11.1: cannot open shared object file: No such file or directory; ...
Actual requirements, because the TF pip package from PyPI links against
CUDA 11.1:
- CUDA driver >= 450.80.02
- CUDA Toolkit 11.1
- CuDNN 8.x
- TensorRT 7.2
UPDATE 2023-01-10: It looks like the TF PyPI packages have been fixed.
Interactive session on GPU node♯
Run an interactive shell on a GPU node:
[juser@picotte001 ~]$ srun -p gpu --gpus-per-node=1 --mem-per-gpu=16G --cpus-per-gpu=12 --time=2:00:00 --pty /bin/bash
[juser@gpu005 ~]$
Check the number of GPUs assigned; you should see only one, with ID number 0:
[juser@gpu005 ~]$ nvidia-smi
Mon Feb 27 17:15:18 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.39.01 Driver Version: 510.39.01 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:18:00.0 Off | 0 |
| N/A 37C P0 41W / 300W | 0MiB / 32768MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
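The same check can be scripted. The following is a minimal sketch, assuming a python3 interpreter is already on the PATH of the GPU node; the helper names parse_gpu_list and visible_gpus are hypothetical, and the remark about CUDA_VISIBLE_DEVICES is an assumption about how Slurm is configured on the cluster. It asks nvidia-smi for a compact CSV listing and returns an empty list if nvidia-smi is not available.

```python
import os
import subprocess


def parse_gpu_list(csv_text):
    """Parse 'index, name' lines as printed by nvidia-smi's CSV query output."""
    gpus = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        index, name = line.split(",", 1)
        gpus.append((int(index), name.strip()))
    return gpus


def visible_gpus():
    """Return the GPUs nvidia-smi reports, or [] if nvidia-smi is unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=index,name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    # Assumption: Slurm sets CUDA_VISIBLE_DEVICES for the allocated GPUs.
    print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
    for index, name in visible_gpus():
        print(f"GPU {index}: {name}")
```

With the allocation requested above, the loop should print a single GPU with index 0.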
Load requirements♯
Set up environment and load appropriate modulefiles:
[juser@gpu005 ~]$ module use /ifs/opt_cuda/modulefiles
[juser@gpu005 ~]$ module load python/gcc/3.10
[juser@gpu005 ~]$ module load cuda11.2/toolkit cuda11.2/blas cuda11.2/fft tensorrt-cuda11.2/7.2.3.4 cudnn8.7-cuda11.2 cutensor-cuda11.2
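Before installing anything, it can be useful to confirm that the loaded modulefiles actually put the CUDA libraries on the dynamic loader's search path. The sketch below is one way to do that with ctypes. The list of library names is an assumption: libnvinfer.so.7 is taken from the warning message shown earlier, and the other names are the ones TensorFlow 2.11 is expected to look for, so adjust them to your installation.

```python
import ctypes

# Assumed sonames: libnvinfer.so.7 appears in the warning shown earlier; the
# others are what TensorFlow 2.11 is expected to load. Adjust as needed.
CANDIDATE_LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcudnn.so.8",
    "libnvinfer.so.7",
]


def can_load(soname):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    for soname in CANDIDATE_LIBS:
        status = "found" if can_load(soname) else "NOT found"
        print(f"{soname}: {status}")
```

A "NOT found" line here suggests that TensorFlow would print a matching "Could not load dynamic library" warning.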
Set up Python virtual environment (venv)♯
[juser@gpu005 ~]$ cd /ifs/groups/myrsrchGrp
[juser@gpu005 myrsrchGrp]$ mkdir venvs
[juser@gpu005 myrsrchGrp]$ python3 -m venv ./venvs/py310-tf211
[juser@gpu005 myrsrchGrp]$ source ./venvs/py310-tf211/bin/activate
(py310-tf211) [juser@gpu005 myrsrchGrp]$
Note the change of prompt: the venv name "(py310-tf211)" is added.
Next, check that the venv is active by looking at the location of the
python3 executable:
(py310-tf211) [juser@gpu005 myrsrchGrp]$ which python3
/ifs/groups/myrsrchGrp/venvs/py310-tf211/bin/python3
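The check can also be made from inside Python, which is handy at the top of a script that must only run in the venv. A minimal sketch, with the helper name in_venv being hypothetical: inside a venv, sys.prefix points at the venv directory while sys.base_prefix points at the underlying Python installation.

```python
import sys


def in_venv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("executable:", sys.executable)
    print("inside a venv:", in_venv())
```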
Update pip and setuptools (because there is a critical setuptools
security fix):
(py310-tf211) [juser@gpu005 myrsrchGrp]$ python3 -m pip install -U pip setuptools
Requirement already satisfied: pip in ./venvs/py310-tf211/lib/python3.10/site-packages (22.2.2)
Collecting pip
Using cached pip-22.3.1-py3-none-any.whl (2.1 MB)
Requirement already satisfied: setuptools in ./venvs/py310-tf211/lib/python3.10/site-packages (63.2.0)
Collecting setuptools
Using cached setuptools-65.6.3-py3-none-any.whl (1.2 MB)
Installing collected packages: setuptools, pip
Attempting uninstall: setuptools
Found existing installation: setuptools 63.2.0
Uninstalling setuptools-63.2.0:
Successfully uninstalled setuptools-63.2.0
Attempting uninstall: pip
Found existing installation: pip 22.2.2
Uninstalling pip-22.2.2:
Successfully uninstalled pip-22.2.2
Successfully installed pip-22.3.1 setuptools-65.6.3
Install TensorFlow♯
Install TensorFlow 2.11.0 using pip:
(py310-tf211) [juser@gpu005 myrsrchGrp]$ python3 -m pip install tensorflow==2.11.0
Collecting tensorflow==2.11.0
Downloading tensorflow-2.11.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (588.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 588.3/588.3 MB 3.7 MB/s eta 0:00:00
...
Installing collected packages: tensorboard-plugin-wit, pyasn1, libclang, flatbuffers, wrapt, wheel, urllib3, typing-extensions, termcolor, tensorflow-io-gcs-filesystem, tensorflow-estimator, tensorboard-data-server, six, rsa, pyparsing, pyasn1-modules, protobuf, oauthlib, numpy, MarkupSafe, markdown, keras, idna, grpcio, gast, charset-normalizer, certifi, cachetools, absl-py, werkzeug, requests, packaging, opt-einsum, h5py, google-pasta, google-auth, astunparse, requests-oauthlib, google-auth-oauthlib, tensorboard, tensorflow
Successfully installed MarkupSafe-2.1.1 absl-py-1.3.0 astunparse-1.6.3 cachetools-5.2.0 certifi-2022.9.24 charset-normalizer-2.1.1 flatbuffers-22.11.23 gast-0.4.0 google-auth-2.15.0 google-auth-oauthlib-0.4.6 google-pasta-0.2.0 grpcio-1.51.1 h5py-3.7.0 idna-3.4 keras-2.11.0 libclang-14.0.6 markdown-3.4.1 numpy-1.23.5 oauthlib-3.2.2 opt-einsum-3.3.0 packaging-21.3 protobuf-3.19.6 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyparsing-3.0.9 requests-2.28.1 requests-oauthlib-1.3.1 rsa-4.9 six-1.16.0 tensorboard-2.11.0 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorflow-2.11.0 tensorflow-estimator-2.11.0 tensorflow-io-gcs-filesystem-0.28.0 termcolor-2.1.1 typing-extensions-4.4.0 urllib3-1.26.13 werkzeug-2.2.2 wheel-0.38.4 wrapt-1.14.1
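To confirm which versions ended up in the venv without importing TensorFlow itself, the package metadata can be queried. This is a small sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for package in ("pip", "setuptools", "tensorflow"):
        print(package, installed_version(package))
```

Run inside the activated venv, the tensorflow line should report 2.11.0.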
Test TensorFlow♯
Run a simple one-line test that creates a random 1000x1000 tensor and
performs a reduce_sum():
(py310-tf211) [juser@gpu005 myrsrchGrp]$ python3 -c "import tensorflow as tf; print(tf.__version__); print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2023-01-10 18:31:18.187484: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-01-10 18:31:18.290904: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2.11.0
2023-01-10 18:31:24.326718: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-01-10 18:31:24.840312: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30972 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:3b:00.0, compute capability: 7.0
tf.Tensor(354.97415, shape=(), dtype=float32)
The result tf.Tensor(354.97415, shape=(), dtype=float32) will be
different for you since it is computed from a random tensor.
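The log line "Created device ... GPU:0" indicates that TensorFlow found the GPU. A more explicit check is sketched below using tf.config.list_physical_devices. The helper name tensorflow_gpus is hypothetical, and the guard for a missing TensorFlow installation is only there so the script fails gently outside the venv.

```python
import importlib.util


def tensorflow_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TF is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf  # imported lazily: importing TF is slow and noisy
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = tensorflow_gpus()
    if gpus is None:
        print("tensorflow is not installed in this environment")
    elif gpus:
        print("TensorFlow sees:", gpus)
    else:
        print("TensorFlow is installed but sees no GPU")
```

An empty list on a GPU node usually points back to the modulefiles: one of the CUDA libraries could not be loaded.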
For interactive use, remember to deactivate the venv once you are done with TensorFlow:
(py310-tf211) [juser@gpu005 myrsrchGrp]$ deactivate
[juser@gpu005 myrsrchGrp]$ which python3
/ifs/opt/python/gcc/3.10.2/bin/python3
Note that the prompt loses the "(py310-tf211)" tag.
Job scripts♯
Job scripts will need to set up the same environment before running the Python script.
Example Python script♯
Create and save this file as "test_tf.py":
#!/usr/bin/env python3
import tensorflow as tf
print(tf.__version__)
print(tf.reduce_sum(tf.random.normal([1000, 1000])))
Example job script♯
Create a job script named "tf_job.sh", in the same directory as the above
test_tf.py file, to run the TensorFlow computation:
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gpus-per-node=1
#SBATCH --cpus-per-gpu=12
#SBATCH --mem-per-gpu=40G
#SBATCH --time=0:15:00
module use /ifs/opt_cuda/modulefiles
module load python/gcc/3.10
module load cuda11.2/toolkit cuda11.2/blas cuda11.2/fft tensorrt-cuda11.2/7.2.3.4 cudnn8.7-cuda11.2 cutensor-cuda11.2
# activate TF venv
source /ifs/groups/myrsrchGrp/venvs/py310-tf211/bin/activate
python3 test_tf.py
Submit the job:
[juser@picotte001 ~]$ sbatch tf_job.sh
The output will be in a file named "slurm-NNNNNNN.out" where "NNNNNNN"
is the job ID. Its contents should be something like:
2022-12-05 07:51:42.721183: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-05 07:51:42.841257: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2.11.0
2022-12-05 07:51:44.995935: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-05 07:51:45.495916: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30972 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:86:00.0, compute capability: 7.0
tf.Tensor(213.94812, shape=(), dtype=float32)
CAUTIONS♯
XLA_FLAGS environment variable♯
For some TensorFlow applications, an environment variable may need to be set:
export XLA_FLAGS="--xla_gpu_cuda_data_dir=/cm/shared/apps/cuda11.2"
Do it in your job script, before the line that runs your TF code.
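If editing the job script is inconvenient, the variable can instead be set at the top of the Python script. The sketch below assumes that XLA reads XLA_FLAGS from the process environment, so the call is placed before TensorFlow is imported; the helper name add_xla_cuda_data_dir is hypothetical, and the path is the one from the export line above.

```python
import os


def add_xla_cuda_data_dir(cuda_dir):
    """Append --xla_gpu_cuda_data_dir to XLA_FLAGS, keeping any existing flags."""
    flag = f"--xla_gpu_cuda_data_dir={cuda_dir}"
    existing = os.environ.get("XLA_FLAGS", "")
    if flag not in existing.split():
        os.environ["XLA_FLAGS"] = f"{existing} {flag}".strip()
    return os.environ["XLA_FLAGS"]


# Call this before importing TensorFlow (assumption: the flag is read from
# the environment of the running process).
add_xla_cuda_data_dir("/cm/shared/apps/cuda11.2")

# import tensorflow as tf   # only after XLA_FLAGS has been set
```

Appending rather than overwriting preserves any XLA flags already exported by the job script.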
Missing libdevice.10.bc♯
Even when the environment variables that define the path to the CUDA
Toolkit installation are set correctly, TF can have trouble finding the
library file libdevice.10.bc.
The workaround is to copy it to the same directory as your Python TF script:
[juser@gpu001 ~]$ cp $CUDA_DIR/nvvm/libdevice/libdevice.10.bc .
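The copy can also be done from Python, for example at the start of a script that may run from different working directories. This is a minimal sketch, assuming the CUDA_DIR environment variable is set by the CUDA toolkit modulefile as in the cp command above; the helper name copy_libdevice is hypothetical.

```python
import os
import shutil
from pathlib import Path


def copy_libdevice(cuda_dir, dest_dir="."):
    """Copy libdevice.10.bc from a CUDA Toolkit tree into dest_dir.

    Returns the destination path, or None if the source file does not exist.
    """
    source = Path(cuda_dir) / "nvvm" / "libdevice" / "libdevice.10.bc"
    if not source.is_file():
        return None
    return Path(shutil.copy2(source, dest_dir))


if __name__ == "__main__":
    cuda_dir = os.environ.get("CUDA_DIR")
    if cuda_dir is None:
        print("CUDA_DIR is not set; load the CUDA toolkit modulefile first")
    else:
        print("copied to:", copy_libdevice(cuda_dir))
```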
Examples♯
- Slurm - Job Script Example 08 TensorFlow using virtualenv
- Slurm - Job Script Example 08a TensorFlow multi-GPU using virtualenv