Slurm - Job Script Example 06 Matlab

This example runs a MATLAB [1] script as a Slurm batch job.

Code

Matlab script

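The MATLAB script is a file named myprog.m. It computes the Mandelbrot set [2] four ways (on the CPU, on the GPU with gpuArray, on the GPU with arrayfun, and on the GPU with a CUDA kernel), times each run, and saves each rendering as a .png [3] file, four images in total.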
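
As a reading aid, the escape-time count accumulated by the CPU and naive GPU variants in the listing can be written as follows, with c a grid point (z0 in the code) and N equal to maxIterations. This restates the loop in the listing and is not a formula taken from the MATLAB documentation. The arrayfun and CUDA kernel variants delegate to pctdemo_processMandelbrotElement, whose source is not shown on this page, so the formula is not claimed for them.

```latex
z_0 = c, \qquad z_{k+1} = z_k^2 + c, \qquad
\mathrm{count}(c) = 1 + \sum_{k=1}^{N+1} \mathbf{1}\bigl[\,|z_k| \le 2\,\bigr], \qquad
\mathrm{pixel}(c) = \log \mathrm{count}(c)
```

The sum has N+1 terms because the loop runs over n = 0:maxIterations, and the leading 1 comes from initializing count with ones.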

maxIterations = 50000;
gridSize = 4000;
xlim = [-0.748766713922161, -0.748766707771757];
ylim = [ 0.123640844894862,  0.123640851045266];

% Version 1: CPU only
% Setup
t = tic();
x = linspace( xlim(1), xlim(2), gridSize );
y = linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );
z0 = xGrid + 1i*yGrid;
count = ones( size(z0) );

% Calculate
z = z0;
for n = 0:maxIterations
    z = z.*z + z0;
    inside = abs( z )<=2;
    count = count + inside;
end
count = log( count );

% Show
cpuTime = toc( t );
fig = gcf;
fig.Position = [200 200 600 600];
imagesc( x, y, count );
colormap( [jet();flipud( jet() );0 0 0] );
axis off
title( sprintf( '%1.2fsecs (without GPU)', cpuTime ) );
saveas(gcf,'withoutGPU.png');

% Version 2: naive GPU, the same loop run on gpuArray data
% Setup
t = tic();
x = gpuArray.linspace( xlim(1), xlim(2), gridSize );
y = gpuArray.linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );
z0 = complex( xGrid, yGrid );
count = ones( size(z0), 'gpuArray' );

% Calculate
z = z0;
for n = 0:maxIterations
    z = z.*z + z0;
    inside = abs( z )<=2;
    count = count + inside;
end
count = log( count );

% Show
count = gather( count ); % Fetch the data back from the GPU
naiveGPUTime = toc( t );
imagesc( x, y, count )
axis off
title( sprintf( '%1.3fsecs (naive GPU) = %1.1fx faster', ...
    naiveGPUTime, cpuTime/naiveGPUTime ) )
saveas(gcf,'naiveGPU.png');

% Version 3: GPU arrayfun, an element-wise function applied on the GPU
% Setup
t = tic();
x = gpuArray.linspace( xlim(1), xlim(2), gridSize );
y = gpuArray.linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );

% Calculate
count = arrayfun( @pctdemo_processMandelbrotElement, ...
                  xGrid, yGrid, maxIterations );

% Show
count = gather( count ); % Fetch the data back from the GPU
gpuArrayfunTime = toc( t );
imagesc( x, y, count )
axis off
title( sprintf( '%1.3fsecs (GPU arrayfun) = %1.1fx faster', ...
    gpuArrayfunTime, cpuTime/gpuArrayfunTime ) );
saveas(gcf, 'GPUarrayfun.png');

% Version 4: hand-written CUDA kernel
% Load the kernel from its compiled .ptx file; the .cu file supplies the C prototype
cudaFilename = 'pctdemo_processMandelbrotElement.cu';
ptxFilename = ['pctdemo_processMandelbrotElement.',parallel.gpu.ptxext];
kernel = parallel.gpu.CUDAKernel( ptxFilename, cudaFilename );

% Setup
t = tic();
x = gpuArray.linspace( xlim(1), xlim(2), gridSize );
y = gpuArray.linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );

% Make sure we have sufficient blocks to cover all of the locations
numElements = numel( xGrid );
kernel.ThreadBlockSize = [kernel.MaxThreadsPerBlock,1,1];
kernel.GridSize = [ceil(numElements/kernel.MaxThreadsPerBlock),1];

% Call the kernel
count = zeros( size(xGrid), 'gpuArray' );
count = feval( kernel, count, xGrid, yGrid, maxIterations, numElements );

% Show
count = gather( count ); % Fetch the data back from the GPU
gpuCUDAKernelTime = toc( t );
imagesc( x, y, count )
axis off
title( sprintf( '%1.3fsecs (GPU CUDAKernel) = %1.1fx faster', ...
    gpuCUDAKernelTime, cpuTime/gpuCUDAKernelTime ) );
saveas(gcf,'CUDA.png');

Job script

The job script is a file named PicotteEx.sh, in the same directory as myprog.m:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=128G
#SBATCH --time=1:00:00
#SBATCH --nodelist=gpu002
#SBATCH --gres=gpu:1
#SBATCH -p gpu

module load matlab
module load cuda11.0/toolkit

matlab -nodisplay -nodesktop -nosplash -noFigureWindows < "$PWD/myprog.m"

Job submission

Submit the job script:

[juser@picotte001 ~]$ sbatch PicotteEx.sh
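
For scripted use, the submission step can capture the job ID and look the job up in the queue. This is a minimal sketch, not part of the original example: the function name submit_and_show is made up for illustration, and it relies on sbatch's --parsable option, which prints only the job ID (optionally followed by ";cluster").

```shell
#!/bin/bash
# submit_and_show is a hypothetical helper, not part of the original example.
# Submits a job script, prints its job ID, and shows its queue entry.
submit_and_show() {
    local script="$1" jobid
    # --parsable makes sbatch print "jobid" or "jobid;cluster" and nothing else
    jobid=$(sbatch --parsable "$script") || return 1
    jobid=${jobid%%;*}          # drop the optional ";cluster" suffix
    echo "Submitted job $jobid"
    squeue -j "$jobid"          # show the job's current state
}

# Usage (on a login node with Slurm available):
#   submit_and_show PicotteEx.sh
```

Unless the job script sets --output, Slurm writes the job's standard output to slurm-<jobid>.out in the submission directory, so the captured ID also identifies the log file.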

Output

The job writes four images, one per variant: withoutGPU.png, naiveGPU.png, GPUarrayfun.png, and CUDA.png.
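
After the job finishes, a quick check that all four images were written can look like the sketch below. The file names are taken from the saveas calls in myprog.m; the function name check_outputs is made up for illustration, and the check only tests that each file exists and is non-empty, not that the image is valid.

```shell
#!/bin/bash
# check_outputs is a hypothetical helper, not part of the original example.
# Reports which of the four expected images exist (and are non-empty) in a directory.
check_outputs() {
    local dir="${1:-.}" missing=0 f
    for f in withoutGPU.png naiveGPU.png GPUarrayfun.png CUDA.png; do
        if [ -s "$dir/$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
            missing=1
        fi
    done
    return "$missing"           # 0 if all four are present, 1 otherwise
}

# Usage, from the directory the job was submitted in:
#   check_outputs .
```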

Cost

Based on the published Usage Rates [4], the cost of each Mandelbrot run is:

Device           Run time (s)  SU per unit resource  Cost per SU ($)  Cost to run (cents)
CPU              136.74        1 per core-hour       0.0123           2.24
Naive GPU        7.068         43 per device-hour    0.0123           0.1038
GPU arrayfun     0.320         43 per device-hour    0.0123           0.0047
GPU CUDA kernel  0.302         43 per device-hour    0.0123           0.0044
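
The arithmetic behind the last column can be sketched in shell with awk. The function name cost_cents and its argument order are made up for illustration. The formula (run time in hours, times SU rate, times number of units, times cost per SU) is inferred from the table rather than quoted from the usage-rate page, and the cost per SU is assumed to be in dollars. With one device, the three GPU rows reproduce. The CPU row reproduces only with 48 cores, which is an assumption: the page does not state the core count used for that figure, and the job script requests 16.

```shell
#!/bin/bash
# cost_cents is a hypothetical helper, not part of the original example.
# cost_cents RUNTIME_SECS SU_PER_UNIT_HOUR DOLLARS_PER_SU [UNITS]
# Prints the cost in cents; UNITS (cores or devices) defaults to 1.
cost_cents() {
    awk -v t="$1" -v su="$2" -v rate="$3" -v n="${4:-1}" \
        'BEGIN { printf "%.4f\n", t / 3600 * su * n * rate * 100 }'
}

cost_cents 7.068  43 0.0123       # naive GPU, one device
cost_cents 0.320  43 0.0123       # GPU arrayfun, one device
cost_cents 0.302  43 0.0123       # GPU CUDA kernel, one device
cost_cents 136.74 1  0.0123 48    # CPU; the 48-core count is an assumption
```

Under the same formula, charging the CPU run for the 16 cores requested in the job script would come to about 0.75 cents rather than 2.24.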

References

[1] MATLAB

[2] MATLAB Documentation - Illustrating Three Approaches to GPU Computing: The Mandelbrot Set

[3] MATLAB Documentation - saveas

[4] Usage Rates