Split list of files into an array and pass to function
There are two parts to this. In the first part I need to read a list of files from a directory and split it into 4 arrays. I have done that with the following code,
Code:
# collect list of file names
STATS_INPUT_FILENAMES=($(ls './'$SET'/'$FOLD'/'*'in.txt'))
# get number of files
NUM_INPUT_FILES=${#STATS_INPUT_FILENAMES[@]}
# get size of each subset
PROC_SIZE=$((NUM_INPUT_FILES / 4))
# create array start and stop positions
let "START2 = $PROC_SIZE+1"; let "STOP2 = $START2+$PROC_SIZE";
let "START3 = $STOP2+1"; let "STOP3 = $START3+$PROC_SIZE";
let "START4 = $STOP3+1"
# create 4 arrays, each with 25% of filenames
NUM_FILE_LISTS='4'
FILE_LIST_0=("${STATS_INPUT_FILENAMES[@]:0:$PROC_SIZE}")
FILE_LIST_1=("${STATS_INPUT_FILENAMES[@]:$START2:$STOP2}")
FILE_LIST_2=("${STATS_INPUT_FILENAMES[@]:$START3:$STOP3}")
FILE_LIST_3=("${STATS_INPUT_FILENAMES[@]:$START4:$NUM_INPUT_FILES}")
This is not very elegant, but I think it splits the list up.
Next, I need to pass each of the 4 lists to a bash function, but I can't seem to find a reasonable syntax for doing that. Suggestions would be appreciated.
It may have the list split up, but the 4 lists are nowhere close to containing the same number of elements. In ksh93 and recent versions of bash, the construct for grabbing a subset of the array elements is not:
Code:
${array[@]:start_index:end_index}
it is:
Code:
${array[@]:start_index:number_of_elements}
If we had a list of 10 files named 1 through 10 the four lists created by your code would be:
Code:
2:1 2
5:4 5 6 7 8
4:7 8 9 10
1:10
where the number before the colon is the number of files in the list and the numbers after the colon are the files in that list. (Note that files 7, 8, and 10 each appear in two lists, file 3 isn't in any list, and the lists contain 2, 5, 4, and 1 files.)
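The offset:count behaviour described above can be checked with a few lines of bash. This is only a sketch: the array name files and its contents are made-up test data, not the real file names.

```shell
#!/usr/bin/env bash
# Hypothetical test data standing in for the real file names.
files=(1 2 3 4 5 6 7 8 9 10)

# The second number is a COUNT, not an end index.
by_count=("${files[@]:3:2}")             # 2 elements starting at index 3
echo "${#by_count[@]}:${by_count[*]}"    # prints 2:4 5

# Reading the second number as an end index gives the wrong slice:
# this asks for 5 elements starting at index 3.
misread=("${files[@]:3:5}")
echo "${#misread[@]}:${misread[*]}"      # prints 5:4 5 6 7 8

# A count that runs past the end of the array is silently truncated.
tail_part=("${files[@]:9:10}")
echo "${#tail_part[@]}:${tail_part[*]}"  # prints 1:10
```

The second and third slices are the same ones that produced the 5-element and 1-element lists in the output shown above.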
To get more even lists (and get each file in exactly one of your four lists), you could try something more like:
Code:
# collect list of file names
STATS_INPUT_FILENAMES=($(ls './'$SET'/'$FOLD'/'*'in.txt'))
STATS_INPUT_FILENAMES=(1 2 3 4 5 6 7 8 9 10) # For testing only.
# get number of files
NUM_INPUT_FILES=${#STATS_INPUT_FILENAMES[@]}
# create 4 arrays, each with ~25% of filenames
NUM_FILE_LISTS='4'
# get size of each subset
BASE_LIST_SIZE=$(((NUM_INPUT_FILES) / NUM_FILE_LISTS))
LEFTOVER=$((NUM_INPUT_FILES % NUM_FILE_LISTS))
LIST_SIZE0=$((BASE_LIST_SIZE + (LEFTOVER > 0)))
LIST_SIZE1=$((BASE_LIST_SIZE + (LEFTOVER > 1)))
LIST_SIZE2=$((BASE_LIST_SIZE + (LEFTOVER > 2)))
FILE_LIST_0=("${STATS_INPUT_FILENAMES[@]:0:$LIST_SIZE0}")
FILE_LIST_1=("${STATS_INPUT_FILENAMES[@]:$LIST_SIZE0:$LIST_SIZE1}")
FILE_LIST_2=("${STATS_INPUT_FILENAMES[@]:$((LIST_SIZE0 + LIST_SIZE1)):$LIST_SIZE2}")
FILE_LIST_3=("${STATS_INPUT_FILENAMES[@]:$((LIST_SIZE0 + LIST_SIZE1 + LIST_SIZE2))}")
echo ${#FILE_LIST_0[@]}:${FILE_LIST_0[@]}
echo ${#FILE_LIST_1[@]}:${FILE_LIST_1[@]}
echo ${#FILE_LIST_2[@]}:${FILE_LIST_2[@]}
echo ${#FILE_LIST_3[@]}:${FILE_LIST_3[@]}
Which with the same list of 10 files produces the output:
Code:
3:1 2 3
3:4 5 6
2:7 8
2:9 10
Passing arrays to a function is tricky. The easier approach is to pass any fixed arguments as the 1st arguments to your functions and pass the filenames as a variable argument list with "${FILE_LIST_x[@]}".
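A minimal sketch of that approach might look like the following. The function name process_list, the values S1 and F1, and the sample file names are all made up for illustration.

```shell
#!/usr/bin/env bash
# process_list is a hypothetical function name used for illustration.
process_list() {
    local set_name=$1   # fixed argument 1
    local fold=$2       # fixed argument 2
    shift 2             # drop the fixed arguments
    local files=("$@")  # everything left is the file list

    local f
    for f in "${files[@]}"; do
        echo "$set_name/$fold: $f"
    done
}

# Sample data; note the space in the first name.
FILE_LIST_0=("a in.txt" "b_in.txt")
process_list S1 F1 "${FILE_LIST_0[@]}"
```

Because the expansion is quoted, each array element arrives as exactly one argument, so names containing spaces survive intact.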
Hope this helps...
This User Gave Thanks to Don Cragun For This Post:
Quote:
Passing arrays to a function is tricky. The easier approach is to pass any fixed arguments as the 1st arguments to your functions and pass the filenames as a variable argument list with "${FILE_LIST_x[@]}".
I think that I am going to avoid passing the array for now and see how it goes. I can pass LIST_SIZE0 and LIST_SIZE* and let the function create each sub list. This will mean repeating STATS_INPUT_FILENAMES=($(ls './'$SET'/'$FOLD'/'*'in.txt')) for each function call, but I will put up with that for now.
I guess I misunderstood the syntax for grabbing part of an array. The most important issue here is making sure that each file is on exactly one list. The second priority is making the lists as even as possible.
I can see two ways to pass an array to a function, at least for my bash 4.3.30:
- pass the element count and then the elements, e.g. scr1.sh A B ${#LIST[@]} "${LIST[@]}" C D, and run a for loop to assign them to a local array
- pass the array as a single string, e.g. scr1.sh A B "${LIST[*]}" C D, and define the local array with ARR=($3)
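The first option could be sketched as a function rather than a separate script. The name takes_list and the sample data are hypothetical, and a slice of the positional parameters is used here in place of the for loop. The second option relies on word splitting when ARR=($3) is evaluated, so it would break file names that contain whitespace; the count-based form does not have that problem.

```shell
#!/usr/bin/env bash
# takes_list is a hypothetical function called as:
#   takes_list A B <count> <elements...> C D
takes_list() {
    local first=$1 second=$2 count=$3
    shift 3
    local arr=("${@:1:count}")   # the next <count> arguments are the array
    shift "$count"
    local rest=("$@")            # whatever follows the array
    echo "args=$first,$second count=${#arr[@]} last=${arr[count-1]} rest=${rest[*]}"
}

LIST=("one file.txt" two.txt three.txt)
takes_list A B "${#LIST[@]}" "${LIST[@]}" C D
# prints: args=A,B count=3 last=three.txt rest=C D
```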
This is what I have set up instead of passing the array.
calling code
Code:
# the number of available cores
if [ "$CORES" == "quad" ]; then
    # create 4 arrays, each with ~25% of filenames
    NUM_FILE_LISTS='4'
    PROCESSED='0'
    # get size of each subset
    BASE_LIST_SIZE=$(((NUM_INPUT_FILES) / NUM_FILE_LISTS))
    LEFTOVER=$((NUM_INPUT_FILES % NUM_FILE_LISTS))
    # set up start elements and number of elements for all lists
    # list 0
    START_ELEMENT_0='0'
    NUMBER_OF_ELEMENTS_0=$((BASE_LIST_SIZE + (LEFTOVER > 0)))
    # keep track of number of files processed
    let "PROCESSED=$PROCESSED+$NUMBER_OF_ELEMENTS_0"
    # list 1
    START_ELEMENT_1=$PROCESSED
    #let "START_ELEMENT_1=$START_ELEMENT_0+$NUMBER_OF_ELEMENTS_0"
    NUMBER_OF_ELEMENTS_1=$((BASE_LIST_SIZE + (LEFTOVER > 1)))
    # keep track of number of files processed
    let "PROCESSED=$PROCESSED+$NUMBER_OF_ELEMENTS_1"
    # list 2
    START_ELEMENT_2=$PROCESSED
    NUMBER_OF_ELEMENTS_2=$((BASE_LIST_SIZE + (LEFTOVER > 2)))
    # keep track of number of files processed
    let "PROCESSED=$PROCESSED+$NUMBER_OF_ELEMENTS_2"
    # list 3
    START_ELEMENT_3=$PROCESSED
    # assign the rest to this list
    let "NUMBER_OF_ELEMENTS_3=$NUM_INPUT_FILES-$PROCESSED"
    # keep track of number of files processed
    let "PROCESSED=$PROCESSED+$NUMBER_OF_ELEMENTS_3"
    # call functions to process stats, each in a background subshell
    run_stats_program "$SET" "$FOLD" "$START_ELEMENT_0" "$NUMBER_OF_ELEMENTS_0" &
    # to prevent terminal overrun
    sleep 2
    run_stats_program "$SET" "$FOLD" "$START_ELEMENT_1" "$NUMBER_OF_ELEMENTS_1" &
    sleep 2
    run_stats_program "$SET" "$FOLD" "$START_ELEMENT_2" "$NUMBER_OF_ELEMENTS_2" &
    sleep 2
    run_stats_program "$SET" "$FOLD" "$START_ELEMENT_3" "$NUMBER_OF_ELEMENTS_3" &
    sleep 2
    # wait until subshells have returned
    wait
fi
called function
Code:
function run_stats_program {
    # function args
    SET_F=$1
    FOLD_F=$2
    START_ELEMENT_F=$3
    NUMBER_OF_ELEMENTS_F=$4
    # get list of stats input files in fold directory
    STATS_INPUT_FILENAMES_F=($(ls './'$SET_F'/'$FOLD_F'/'*'in.txt'))
    # create file list as subset of STATS_INPUT_FILENAMES_F
    FILE_LIST=("${STATS_INPUT_FILENAMES_F[@]:$START_ELEMENT_F:$NUMBER_OF_ELEMENTS_F}")
    # for now, just print each file name in this subset
    for INPUT_FILE in "${FILE_LIST[@]}"
    do
        echo "$INPUT_FILE"
    done
}
All this does at this point is print the filenames. In the end, this will process the 4 file lists in 4 subshells. Processing will involve calling a C++ widget to process each file. This setup allows 4 instances of the C++ app to run simultaneously and use the available CPU resources. There will be a similar code block for hex core.
I get that this is written in long form at the moment. It would be nice for the code to be a bit more compact and elegant, but I don't see a clear way to put the function calls in a loop or something like that.
LMHmedchem
Last edited by LMHmedchem; 01-13-2015 at 05:52 PM..
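One way the four near-identical calls might be folded into a loop is sketched below. This is an illustration under assumptions, not a drop-in replacement: run_stats_program is replaced by a stub that only prints its arguments, SET, FOLD and NUM_INPUT_FILES are given placeholder values, and the sleep 2 between launches is left out for brevity.

```shell
#!/usr/bin/env bash
# Stub standing in for the real function; it only prints its arguments.
run_stats_program() {
    echo "set=$1 fold=$2 start=$3 count=$4"
}

SET=S1; FOLD=F1        # placeholder values
NUM_INPUT_FILES=10     # would normally be ${#STATS_INPUT_FILENAMES[@]}
NUM_FILE_LISTS=4       # 4 for quad core; could be 6 for hex core

BASE_LIST_SIZE=$((NUM_INPUT_FILES / NUM_FILE_LISTS))
LEFTOVER=$((NUM_INPUT_FILES % NUM_FILE_LISTS))

START=0
for ((i = 0; i < NUM_FILE_LISTS; i++)); do
    # the first LEFTOVER lists each get one extra file
    COUNT=$((BASE_LIST_SIZE + (i < LEFTOVER)))
    run_stats_program "$SET" "$FOLD" "$START" "$COUNT" &
    START=$((START + COUNT))
done
wait   # block until every background job has finished
```

With 10 files and 4 lists the loop hands out start/count pairs of 0/3, 3/3, 6/2 and 8/2, so every file lands in exactly one list. The order in which the background jobs print is not guaranteed. Changing NUM_FILE_LISTS would cover the hex-core case without a second code block.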