Installing TensorFlow 2.9.3 using pip and venv
IN PROGRESS
We follow the official instructions for installation via pip, except that we
use a pre-installed Python provided by the modulefile python/gcc/3.10, and we
use Python virtual environments (venv)[1][2] instead of miniconda (or
Anaconda).
N.B. The TensorFlow package from pip supports both CPU-only and GPU
operation. There is no need to install both the "tensorflow" and
"tensorflow-gpu" packages: they are identical.
Requirements
Listed requirements (check the support matrix for each):
- CUDA driver >= 450.80.02
- CUDA Toolkit 11.2
- CuDNN 8.1.0
- Optional: TensorRT to improve latency and throughput for inference
N.B. Using the requirements listed above will result in warning messages about certain shared library (shared object) files not being found:
2022-12-05 07:15:48.158455: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvrtc.so.11.1: cannot open shared object file: No such file or directory; ...
Actual requirements, because the TF pip package from PyPI links to CUDA 11.1:
- CUDA driver >= 450.80.02
- CUDA Toolkit 11.1
- CuDNN 8.0
- TensorRT 7.2
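The mismatch between the listed and the actual requirements shows up as shared libraries that the dynamic loader cannot open. A minimal sketch of checking this before importing TensorFlow, using only the Python standard library, is given here. The two library names are taken from the warning message above; the helper name can_load is hypothetical, and the list is not a complete inventory of what TensorFlow needs.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library (or one of its dependencies) is not found.
        return False
    return True


if __name__ == "__main__":
    # Names taken from the warning message above; not an exhaustive list.
    for soname in ("libnvinfer.so.7", "libnvrtc.so.11.1"):
        status = "found" if can_load(soname) else "NOT found"
        print(f"{soname}: {status}")
```

Run this after loading the modulefiles; a library reported as "NOT found" suggests that a modulefile is missing or that its version does not match.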
Interactive session on GPU node
Run an interactive shell on a GPU node:
[juser@picotte001 ~]$ srun -p gpu --gres=gpu:1 --mem-per-gpu=16G --ntasks-per-gpu=12 --time=2:00:00 --pty /bin/bash
[juser@gpu005 ~]$
Load requirements
Set up environment and load appropriate modulefiles:
[juser@gpu005 ~]$ module use /ifs/opt_cuda/modulefiles
[juser@gpu005 ~]$ module load python/gcc/3.10
[juser@gpu005 ~]$ module load cuda11.1/toolkit cuda11.1/blas cuda11.1/fft cudnn8.0-cuda11.1 tensorrt-cuda11.1/7.2.3.4
Set up Python virtual environment (venv)
[juser@gpu005 ~]$ cd /ifs/groups/myrsrchGrp
[juser@gpu005 myrsrchGrp]$ mkdir venvs
[juser@gpu005 myrsrchGrp]$ python3 -m venv ./venvs/py310-tf29
[juser@gpu005 myrsrchGrp]$ source ./venvs/py310-tf29/bin/activate
(py310-tf29) [juser@gpu005 myrsrchGrp]$
Note the change of prompt: the venv name "(py310-tf29)" is added.
Next, check that the venv is active by looking at the location of the
python3 executable.
(py310-tf29) [juser@gpu005 myrsrchGrp]$ which python3
/ifs/groups/myrsrchGrp/venvs/py310-tf29/bin/python3
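The same check can be made from inside Python, which may be handy in scripts where the shell prompt is not visible. This is a small sketch using only the standard library; the function name in_venv is an arbitrary choice.

```python
import os
import sys


def in_venv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # In a venv, sys.prefix points at the venv, while sys.base_prefix
    # points at the Python installation the venv was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("in venv:", in_venv())
    # The venv activate script exports VIRTUAL_ENV; it is unset otherwise.
    print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV", "(not set)"))
```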
Update pip and setuptools
(py310-tf29) [juser@gpu005 myrsrchGrp]$ python3 -m pip install -U pip setuptools
Requirement already satisfied: pip in ./venvs/py310-tf29/lib/python3.10/site-packages (22.2.2)
Collecting pip
Using cached pip-22.3.1-py3-none-any.whl (2.1 MB)
Requirement already satisfied: setuptools in ./venvs/py310-tf29/lib/python3.10/site-packages (63.2.0)
Collecting setuptools
Using cached setuptools-65.6.3-py3-none-any.whl (1.2 MB)
Installing collected packages: setuptools, pip
Attempting uninstall: setuptools
Found existing installation: setuptools 63.2.0
Uninstalling setuptools-63.2.0:
Successfully uninstalled setuptools-63.2.0
Attempting uninstall: pip
Found existing installation: pip 22.2.2
Uninstalling pip-22.2.2:
Successfully uninstalled pip-22.2.2
Successfully installed pip-22.3.1 setuptools-65.6.3
Install TensorFlow
Install TensorFlow 2.9.3 using pip:
(py310-tf29) [juser@gpu005 myrsrchGrp]$ python3 -m pip install tensorflow==2.9.3
Downloading tensorflow-2.9.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (511.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 511.8/511.8 MB 4.5 MB/s eta 0:00:00
...
Installing collected packages: tensorboard-plugin-wit, pyasn1, libclang, keras, flatbuffers, wrapt, wheel, urllib3, typing-extensions, termcolor, tensorflow-io-gcs-filesystem, tensorflow-estimator, tensorboard-data-server, six, rsa, pyparsing, pyasn1-modules, protobuf, oauthlib, numpy, MarkupSafe, markdown, idna, grpcio, gast, charset-normalizer, certifi, cachetools, absl-py, werkzeug, requests, packaging, opt-einsum, keras-preprocessing, h5py, google-pasta, google-auth, astunparse, requests-oauthlib, google-auth-oauthlib, tensorboard, tensorflow
Successfully installed MarkupSafe-2.1.1 absl-py-1.3.0 astunparse-1.6.3 cachetools-5.2.0 certifi-2022.9.24 charset-normalizer-2.1.1 flatbuffers-1.12 gast-0.4.0 google-auth-2.15.0 google-auth-oauthlib-0.4.6 google-pasta-0.2.0 grpcio-1.51.1 h5py-3.7.0 idna-3.4 keras-2.9.0 keras-preprocessing-1.1.2 libclang-14.0.6 markdown-3.4.1 numpy-1.23.5 oauthlib-3.2.2 opt-einsum-3.3.0 packaging-21.3 protobuf-3.19.6 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyparsing-3.0.9 requests-2.28.1 requests-oauthlib-1.3.1 rsa-4.9 six-1.16.0 tensorboard-2.9.1 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorflow-2.9.3 tensorflow-estimator-2.9.0 tensorflow-io-gcs-filesystem-0.28.0 termcolor-2.1.1 typing-extensions-4.4.0 urllib3-1.26.13 werkzeug-2.2.2 wheel-0.38.4 wrapt-1.14.1
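To confirm which versions ended up in the venv without importing TensorFlow (which is slow and prints log messages), the package metadata can be queried directly. This is a sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # A few of the packages named in the pip output above.
    for name in ("tensorflow", "keras", "numpy"):
        print(name, installed_version(name))
```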
Test TensorFlow
Run a simple one-line test to create a random 1000x1000 tensor and
perform a reduce_sum():
(py310-tf29) [juser@gpu005 myrsrchGrp]$ python3 -c "import tensorflow as tf; print(tf.__version__); print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2022-12-05 08:26:36.522369: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2.9.3
2022-12-05 08:26:41.863370: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-05 08:26:42.414230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30988 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:18:00.0, compute capability: 7.0
tf.Tensor(1753.4398, shape=(), dtype=float32)
The value in the result tf.Tensor(1753.4398, shape=(), dtype=float32) will be different for you because the tensor is random.
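To judge whether a printed value is plausible: the sum of 1000 x 1000 independent standard normal values has mean 0 and standard deviation sqrt(1000000) = 1000, so results of a few hundred or a couple of thousand in either direction are expected. The sketch below illustrates this with Python's standard random module rather than TensorFlow (a substitution, so that it runs without a GPU or the venv); the function name and the seed are arbitrary choices.

```python
import math
import random


def normal_sum(n, seed=None):
    """Sum n independent draws from a standard normal distribution."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) for _ in range(n))


if __name__ == "__main__":
    n = 1000 * 1000  # same number of elements as the 1000x1000 tensor
    print("sum:", normal_sum(n, seed=0))
    print("expected standard deviation of the sum:", math.sqrt(n))
```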
Deactivate venv
For interactive use, remember to deactivate the venv once you are done with TensorFlow:
(py310-tf29) [juser@gpu005 myrsrchGrp]$ deactivate
[juser@gpu005 myrsrchGrp]$ which python3
/ifs/opt/python/gcc/3.10.2/bin/python3
Note that the prompt loses the "(py310-tf29)" tag.
Job scripts
Job scripts will need to set up the same environment before running the Python script.
Example Python script
Create and save this file as "test_tf.py":
#!/usr/bin/env python3
import tensorflow as tf
print(tf.__version__)
print(tf.reduce_sum(tf.random.normal([1000, 1000])))
Example job script
Create a job script named "tf_job.sh" in the same directory as the above
test_tf.py file, to run the TensorFlow computation:
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-gpu=12
#SBATCH --mem-per-gpu=40G
#SBATCH --time=0:15:00
module use /ifs/opt_cuda/modulefiles
module load python/gcc/3.10
module load cuda11.1/toolkit cuda11.1/blas cuda11.1/fft cudnn8.0-cuda11.1 tensorrt-cuda11.1/7.2.3.4
# activate TF venv
source /ifs/groups/myrsrchGrp/venvs/py310-tf29/bin/activate
python3 test_tf.py
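When debugging a job, it can help to have the Python script record which job and GPU it was given. The sketch below reads two environment variables: SLURM_JOB_ID, which Slurm sets inside a job, and CUDA_VISIBLE_DEVICES, which Slurm normally sets for jobs that request GPUs with --gres (this depends on site configuration, so treat it as an assumption). The function name describe_allocation is hypothetical.

```python
import os


def describe_allocation(env=os.environ):
    """Summarize the Slurm job ID and visible GPUs from environment variables."""
    job_id = env.get("SLURM_JOB_ID", "(not in a Slurm job)")
    gpus = env.get("CUDA_VISIBLE_DEVICES", "(not set)")
    return f"job {job_id}, visible GPUs: {gpus}"


if __name__ == "__main__":
    print(describe_allocation())
```

These lines could be added near the top of test_tf.py so that the summary appears at the start of the job output.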
Submit the job:
[juser@picotte001 ~]$ sbatch tf_job.sh
The output will be in a file named "slurm-NNNNNNN.out" where "NNNNNNN" is
the job ID. Its contents should be something like:
2022-12-05 08:28:05.942714: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2022-12-05 08:28:09.232840: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-05 08:28:09.698834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30988 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:86:00.0, compute capability: 7.0
tf.Tensor(969.971, shape=(), dtype=float32)
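The "Created device" line is the quickest confirmation that the job actually ran on a GPU. A rough sketch of extracting its fields from the job output follows; the regular expression is written against the sample line above and is an assumption, since TensorFlow's log format is not a stable interface and may differ between versions. The names DEVICE_RE and parse_device_line are hypothetical.

```python
import re

# Pattern written against the sample log line shown above.
DEVICE_RE = re.compile(
    r"Created device \S+ with (?P<memory_mb>\d+) MB memory:\s*->\s*"
    r"device: (?P<index>\d+), name: (?P<name>[^,]+), "
    r"pci bus id: (?P<pci>[^,]+), compute capability: (?P<cc>[\d.]+)"
)


def parse_device_line(line):
    """Return a dict describing the GPU, or None if the line does not match."""
    match = DEVICE_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["memory_mb"] = int(info["memory_mb"])
    info["index"] = int(info["index"])
    return info


SAMPLE = (
    "2022-12-05 08:28:09.698834: I tensorflow/core/common_runtime/gpu/"
    "gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/"
    "device:GPU:0 with 30988 MB memory: -> device: 0, "
    "name: Tesla V100-SXM2-32GB, pci bus id: 0000:86:00.0, "
    "compute capability: 7.0"
)

if __name__ == "__main__":
    print(parse_device_line(SAMPLE))
```

If no line in the output file matches, TensorFlow most likely fell back to the CPU, and the modulefiles loaded in the job script are worth rechecking.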
Examples
- Slurm - Job Script Example 08 TensorFlow using virtualenv
- Slurm - Job Script Example 08a TensorFlow multi-GPU using virtualenv