Slurm - Job Script Example 06 Matlab
This example runs a Matlab [1] script as a Slurm batch job on a GPU node.
Code
Matlab script
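The Matlab script is a file named myprog.m. It computes the same view of the Mandelbrot set [2] four ways (on the CPU, on the GPU with gpuArray, on the GPU with arrayfun, and with a CUDA kernel) and saves each result to its own .png [3] file, four images in total. The last two versions call pctdemo_processMandelbrotElement, so the corresponding function and kernel files from the MathWorks example [2] must be available to Matlab.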
maxIterations = 50000;
gridSize = 4000;
xlim = [-0.748766713922161, -0.748766707771757];
ylim = [ 0.123640844894862, 0.123640851045266];
% Version 1: CPU only
% Setup
t = tic();
x = linspace( xlim(1), xlim(2), gridSize );
y = linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );
z0 = xGrid + 1i*yGrid;
count = ones( size(z0) );
% Calculate
z = z0;
for n = 0:maxIterations
    z = z.*z + z0;
    inside = abs( z )<=2;
    count = count + inside;
end
count = log( count );
% Show
cpuTime = toc( t );
fig = gcf;
fig.Position = [200 200 600 600];
imagesc( x, y, count );
colormap( [jet();flipud( jet() );0 0 0] );
axis off
title( sprintf( '%1.2fsecs (without GPU)', cpuTime ) );
saveas(gcf,'withoutGPU.png');
% Version 2: naive GPU, same algorithm on gpuArray data
% Setup
t = tic();
x = gpuArray.linspace( xlim(1), xlim(2), gridSize );
y = gpuArray.linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );
z0 = complex( xGrid, yGrid );
count = ones( size(z0), 'gpuArray' );
% Calculate
z = z0;
for n = 0:maxIterations
    z = z.*z + z0;
    inside = abs( z )<=2;
    count = count + inside;
end
count = log( count );
% Show
count = gather( count ); % Fetch the data back from the GPU
naiveGPUTime = toc( t );
imagesc( x, y, count )
axis off
title( sprintf( '%1.3fsecs (naive GPU) = %1.1fx faster', ...
    naiveGPUTime, cpuTime/naiveGPUTime ) )
saveas(gcf,'naiveGPU.png');
% Version 3: GPU with element-wise arrayfun
% Setup
t = tic();
x = gpuArray.linspace( xlim(1), xlim(2), gridSize );
y = gpuArray.linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );
% Calculate
count = arrayfun( @pctdemo_processMandelbrotElement, ...
    xGrid, yGrid, maxIterations );
% Show
count = gather( count ); % Fetch the data back from the GPU
gpuArrayfunTime = toc( t );
imagesc( x, y, count )
axis off
title( sprintf( '%1.3fsecs (GPU arrayfun) = %1.1fx faster', ...
    gpuArrayfunTime, cpuTime/gpuArrayfunTime ) );
saveas(gcf, 'GPUarrayfun.png');
% Version 4: GPU with a hand-written CUDA kernel
% Load the kernel
cudaFilename = 'pctdemo_processMandelbrotElement.cu';
ptxFilename = ['pctdemo_processMandelbrotElement.',parallel.gpu.ptxext];
kernel = parallel.gpu.CUDAKernel( ptxFilename, cudaFilename );
% Setup
t = tic();
x = gpuArray.linspace( xlim(1), xlim(2), gridSize );
y = gpuArray.linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );
% Make sure we have sufficient blocks to cover all of the locations
numElements = numel( xGrid );
kernel.ThreadBlockSize = [kernel.MaxThreadsPerBlock,1,1];
kernel.GridSize = [ceil(numElements/kernel.MaxThreadsPerBlock),1];
% Call the kernel
count = zeros( size(xGrid), 'gpuArray' );
count = feval( kernel, count, xGrid, yGrid, maxIterations, numElements );
% Show
count = gather( count ); % Fetch the data back from the GPU
gpuCUDAKernelTime = toc( t );
imagesc( x, y, count )
axis off
title( sprintf( '%1.3fsecs (GPU CUDAKernel) = %1.1fx faster', ...
    gpuCUDAKernelTime, cpuTime/gpuCUDAKernelTime ) );
saveas(gcf,'CUDA.png');
Job script
The job script is a file named PicotteEx.sh, in the same directory as myprog.m:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=128G
#SBATCH --time=1:00:00
#SBATCH --nodelist=gpu002
#SBATCH --gres=gpu:1
#SBATCH -p gpu
module load matlab
module load cuda11.0/toolkit
matlab -nodisplay -nodesktop -nosplash -noFigureWindows < "$PWD/myprog.m"
Job submission
Submit the job script:
[juser@picotte001 ~]$ sbatch PicotteEx.sh
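After submission, sbatch normally prints a confirmation line of the form "Submitted batch job <id>". The sketch below shows one way to capture that ID and follow the job. The helper name job_id_from_sbatch and the job ID 123456 are made up for illustration; only the commented-out lines need a cluster.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric job ID out of sbatch's confirmation line.
job_id_from_sbatch() {
    # Expected input: "Submitted batch job <id>"
    awk '/^Submitted batch job/ { print $4 }'
}

# Demonstration with a made-up confirmation line (no cluster needed):
echo "Submitted batch job 123456" | job_id_from_sbatch

# On the cluster, the helper could be chained like this:
#   jobid=$(sbatch PicotteEx.sh | job_id_from_sbatch)
#   squeue -j "$jobid"                               # pending or running?
#   sacct -j "$jobid" --format=JobID,State,Elapsed   # accounting once it has finished
```

By default Slurm also writes the job's standard output to a file named slurm-<id>.out in the submission directory, which is where the Matlab console output of this example ends up.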
Output
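The job writes four images, one per version of the computation: withoutGPU.png, naiveGPU.png, GPUarrayfun.png, and CUDA.png.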
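A quick way to confirm that the run completed all four versions is to check for the image files that myprog.m passes to saveas. The following is a minimal sketch. The function name missing_images is hypothetical, and it assumes the images land in the directory you point it at (by default the current one).

```shell
#!/bin/bash
# Hypothetical helper: print the expected images that are missing from a
# directory (defaults to the current directory). Prints nothing if all exist.
missing_images() {
    local dir="${1:-.}" f
    for f in withoutGPU.png naiveGPU.png GPUarrayfun.png CUDA.png; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Demonstration in a scratch directory where only two images exist.
demo_dir=$(mktemp -d)
touch "$demo_dir/withoutGPU.png" "$demo_dir/CUDA.png"
missing_images "$demo_dir"   # lists naiveGPU.png and GPUarrayfun.png
rm -rf "$demo_dir"
```

If a later image is missing while earlier ones exist, the script most likely stopped partway through, and the Slurm output file is the place to look for the Matlab error.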
Cost
Based on the published Usage Rates [4], the cost of running the Mandelbrot computation with each approach is shown below:
Method | Run time (s) | SU per unit resource | Cost per SU ($) | Cost to run (cents) |
---|---|---|---|---|
CPU | 136.74 | 1/core-hour | 0.0123 | 2.24 |
Naive GPU | 7.068 | 43/device-hour | 0.0123 | 0.1038 |
GPU arrayfun | 0.320 | 43/device-hour | 0.0123 | 0.0047 |
GPU CUDA Kernel | 0.302 | 43/device-hour | 0.0123 | 0.0044 |
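The figures in the last column can be reproduced with the formula: run time in hours, times SU rate, times number of resource units, times price per SU, times 100 to convert dollars to cents. The sketch below is one way to check the arithmetic. The function name cost_cents is hypothetical, and the unit counts are assumptions: one GPU device matches the three GPU rows, while the CPU row is only reproduced with 48 cores, a value inferred from the table rather than stated in the text.

```shell
#!/bin/bash
# cost_cents RUNTIME_SECONDS SU_PER_UNIT_HOUR UNITS DOLLARS_PER_SU
# Prints the charge in cents: hours x SU rate x units x $/SU x 100.
cost_cents() {
    awk -v t="$1" -v rate="$2" -v units="$3" -v price="$4" \
        'BEGIN { printf "%.4f\n", t / 3600 * rate * units * price * 100 }'
}

cost_cents 7.068  43 1  0.0123   # naive GPU, one device      -> 0.1038
cost_cents 0.320  43 1  0.0123   # GPU arrayfun, one device   -> 0.0047
cost_cents 0.302  43 1  0.0123   # GPU CUDA kernel, one device -> 0.0044
cost_cents 136.74 1  48 0.0123   # CPU; 48 cores is an inferred assumption -> 2.2425
```

Note that the job script requests 16 cores with --cpus-per-task, for which the same formula gives about 0.75 cents. The source does not say why the CPU row corresponds to 48 cores.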
References
[1] MATLAB
[2] MATLAB Documentation - Illustrating Three Approaches to GPU Computing: The Mandelbrot Set
[3] MATLAB Documentation - saveas
[4] Usage Rates