
Processing Many Sequentially Named Input Files

You have many input files to process (maybe you are converting data from one format to another). The file names are in sequential order:

rawdata01.txt rawdata02.txt ... rawdata13.txt

The example script and its input files are in

/ifs/opt/Examples/Example01

We will take advantage of Slurm's job array functionality (#SBATCH --array) and reference the array index in the script via the SLURM_ARRAY_TASK_ID environment variable.
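Each array task is a separate job that runs the same script, but with SLURM_ARRAY_TASK_ID set to its own index (here, 1 through 13). A minimal sketch of the idea (the echo line is only illustrative and is not part of the example script):

#SBATCH --array=1-13
### Every one of the 13 array tasks executes this same line,
### but each task sees a different value of SLURM_ARRAY_TASK_ID
echo "This is array task ${SLURM_ARRAY_TASK_ID}"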

Script

#!/bin/bash
#SBATCH --partition=def
#SBATCH --nodes=1
### Each array task runs a single sed process, so one task is enough
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:03:00
#SBATCH --mem=1GB

#SBATCH --array=1-13

. /etc/profile.d/modules.sh
module load shared
module load slurm/picotte

cd /ifs/groups/myGrp/juser

### Since we will have many simultaneous processes on possibly many nodes
### writing to the same directory, we use the BeeGFS filesystem
DATADIR=/beegfs/scratch/juser/Examples/Example01

### Zero-pad the task ID to two digits so the fileid matches the input
### file names (01, 02, ..., 13)
fileid=$(printf "%02d" ${SLURM_ARRAY_TASK_ID})

### Convert this task's input file: replace 'hello' with 'goodbye' and
### write the result to the matching moddataNN.txt file
sed -e 's/hello/goodbye/' ${DATADIR}/rawdata${fileid}.txt > ${DATADIR}/moddata${fileid}.txt
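To run the example, submit the script once; Slurm creates the 13 array tasks automatically. A short usage sketch, assuming the script above is saved as array_example.sh (the file name is illustrative):

sbatch array_example.sh
squeue -u juser
ls /beegfs/scratch/juser/Examples/Example01/moddata*.txt

In the squeue output, the array tasks appear with job IDs of the form JOBID_1 through JOBID_13. Once all of them finish, there is one moddataNN.txt output file for each rawdataNN.txt input file.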