Job Script Example 02 Many Input Files
Description
This is similar to Job Script Example 01 Many Input Files, except that the files are not named sequentially. Instead, the needed file names are listed in a separate file.
The example script and input files are in:
/mnt/HA/opt/Examples/Example02
Input File
The input file is named list_of_files.txt. Its contents are:
2KontuepFego.txt
atph7QuodsId.txt
glyrydrivUs5.txt
hidAjyonOct2.txt
ikDugOdcayp4.txt
irdaikIbDik8.txt
JuiberAnNup1.txt
KrighwennAr6.txt
mepViavejub7.txt
NidgitOtElm0.txt
rosivPicdon9.txt
scomghovJer1.txt
SwiviphEpur5.txt
tyWocaibyav3.txt
VoryifuttEk1.txt
Whan8Harhij6.txt
wivErtAcper3.txt
yabhavnekIb9.txt
Script
File names are listed in a file
This script demonstrates a useful pattern for when the input arguments are not numerically sequential. Read the file names into an array variable[1], then use SGE_TASK_ID to index into that array, so that each task of the array job processes exactly one file.
#!/bin/bash
#$ -S /bin/bash
#$ -N example02a
#$ -j y
#$ -cwd
#$ -M fixme@drexel.edu
#$ -P fixmePrj
#$ -l h_rt=300
#$ -q all.q@@amdhosts
#$ -t 1:18:1
### NOTE: the upper bound of -t (18 here) must equal the number of file names in list_of_files.txt,
### so you must know the number of files at the time of qsub
. /etc/profile
module load shared
module load sge/univa
### Read one file name per line into a bash array (mapfile requires bash 4 or later)
mapfile -t filenames < list_of_files.txt
### NOTE: bash array indices start at 0, but SGE task IDs start at 1
### Zero-pad the task ID so the output files sort in order (moddata01.txt, moddata02.txt, ...)
taskid=$(printf %02d "$SGE_TASK_ID")
sed -e 's/hello/goodbye/' "${filenames[$(( SGE_TASK_ID - 1 ))]}" > "moddata${taskid}.txt"
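The mapping from task IDs to file names can be checked locally before submitting anything. The following is a minimal sketch, not part of Example02: it writes a shortened stand-in list (three names taken from the list above) into a temporary directory, counts the entries to get the upper bound needed for -t, and then sets SGE_TASK_ID by hand to mimic what the scheduler does for each task. The temporary directory and the shortened list are assumptions made for illustration only.

```shell
#!/bin/bash
# Sketch only: simulate the task-ID-to-file-name lookup without the scheduler.
# The temporary directory and the three-line list are assumptions for illustration.
workdir=$(mktemp -d)
printf '%s\n' 2KontuepFego.txt atph7QuodsId.txt glyrydrivUs5.txt > "$workdir/list_of_files.txt"

# Number of entries in the list: this is the upper bound needed for "#$ -t"
nfiles=$(wc -l < "$workdir/list_of_files.txt")
nfiles=$(( nfiles ))   # normalize, since some wc implementations pad with spaces
echo "number of tasks needed: $nfiles"

# Read the file names into an array, one element per whitespace-separated word
declare -a filenames=( $( cat "$workdir/list_of_files.txt" ) )

# Mimic the scheduler: task IDs run from 1 to nfiles, array indices from 0
for SGE_TASK_ID in $(seq 1 "$nfiles"); do
    taskid=$(printf %02d "$SGE_TASK_ID")
    echo "task $taskid reads ${filenames[$(( SGE_TASK_ID - 1 ))]} and writes moddata${taskid}.txt"
done

rm -rf "$workdir"
```

Running the same loop against the real list_of_files.txt is a quick way to confirm that the range given to -t covers every file exactly once.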
Input files are sequential but not listed in a separate file
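This is similar to the above, but the file names are not read from a list file. Instead, the files are named in some ordered way, and the array is built from the files in the working directory that match a pattern.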
#!/bin/bash
#$ -S /bin/bash
#$ -N example02b
#$ -j y
#$ -cwd
#$ -M fixme@drexel.edu
#$ -P fixmePrj
#$ -l h_rt=300
#$ -q all.q@@amdhosts
#$ -t 1:18:1
### NOTE: the upper bound of -t (18 here) must equal the number of *.input files,
### so you must know the number of files at the time of qsub
. /etc/profile
module load shared
module load sge/univa
### The input files are named in some ordered way, e.g. aaa.input, aab.input, aac.input, ...
### Let the shell expand the pattern directly (sorted order); this avoids parsing the output of ls
declare -a filenames=( *.input )
### NOTE: bash array indices start at 0, but SGE task IDs start at 1
### You could change SGE task IDs by doing #$ -t 0:17:1 instead
### Zero-pad the task ID so the output files sort in order (moddata01.txt, moddata02.txt, ...)
taskid=$(printf %02d "$SGE_TASK_ID")
sed -e 's/hello/goodbye/' "${filenames[$(( SGE_TASK_ID - 1 ))]}" > "moddata${taskid}.txt"
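If the range given to -t is larger than the number of matching files, the extra tasks index past the end of the array and pass an empty file name to sed. One possible guard is sketched below. It is an illustration under stated assumptions, not part of Example02: the three .input files are created in a temporary directory, and check_task is a hypothetical helper name introduced here.

```shell
#!/bin/bash
# Sketch only: guard against a task ID that has no matching input file.
# The temporary directory and the three .input files are assumptions for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch aaa.input aab.input aac.input

# Build the array from the glob; the shell expands it in sorted order
declare -a filenames=( *.input )
nfiles=${#filenames[@]}

# check_task is a hypothetical helper, not part of the original script.
# It prints the file a task would process, or fails if the ID is out of range.
check_task() {
    local id=$1
    if [ "$id" -lt 1 ] || [ "$id" -gt "$nfiles" ]; then
        echo "task $id: no input file (only $nfiles files found)"
        return 1
    fi
    echo "task $id: ${filenames[$(( id - 1 ))]}"
}

check_task 1
check_task 3
check_task 4 || echo "a real job script would exit here instead of running sed"

cd / && rm -rf "$workdir"
```

In a job script, the same test could be applied to $SGE_TASK_ID just before the sed line, so that surplus tasks stop early rather than producing empty output files.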