Alternative solution to nested loops in shell programming

 
# 1  
Old 09-21-2015


1. The problem statement, all variables and given/known data:

Hi,

I am reading a flat file line by line with a while loop. The file contains about 100k records, and each record has 25 columns. For each line, I have to pair the field names held in an array with the corresponding fields extracted from that line. I tried a for loop inside the while loop, but the performance is very poor. Is there an alternative approach that avoids the nested loops? Any help would be greatly appreciated.


2. Relevant commands, code, scripts, algorithms:

Command to run the script:

Create_Index.ksh <config_file> "ABC" 1

Indexfields_1 contains the comma-separated field names for which the mapping needs to be created.

E.g.: "A","B","C","D", ... and so on, for 25 fields.
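
For what it's worth, a comma-separated line like this can be split into its fields in one step with the shell's own word splitting, instead of calling cut once per field. A minimal sketch, assuming a made-up sample line (`index_line` and the field names in it are hypothetical) and values that contain no commas of their own:

```shell
# Hypothetical sample; the real line would come from Indexfields_1.
index_line='ABC,DOC_NAME,DOC_OFFSET,DOC_LENGTH'

old_ifs=$IFS
set -f              # no filename expansion while the line is split
IFS=,
set -- $index_line  # each comma-separated field becomes $1, $2, ...
IFS=$old_ifs
set +f

field_total=$#
first_field=$1
echo "fields: $field_total, first: $first_field"
for name in "$@"; do
    echo "field name: $name"
done
```

This replaces the positional parameters, so in a real script the command-line arguments would have to be saved in variables first.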

Code:
#!/usr/bin/ksh

if [[ $# != 3 ]];then
        echo "Incorrect number of arguments sent to script"
        echo "Usage: Create_Index.ksh <config_file_name> <table_identifier> <segment_number>"
        echo "Insufficient parameters to continue execution. Exiting the $(basename ${0}) script with 1 at $(date)"
        exit 1
fi


config_file=${1}

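if [ -s "${config_file}" ]
then
        . "${config_file}"
else
        # log() and its log file are not set up yet, so report on stderr and stop
        echo "Config file not found: ${config_file}" >&2
        exit 1
fi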


#-------------------------------------
# function to log message to log file
#-------------------------------------
function log
{
        msg="$1"

        echo "== $(date '+%m/%d/%Y %H:%M:%S')  :${msg}" >>${IndexCreation_DAILY_LOG}
}

#-------------------------------------
# function ends
#------------------------------------


base_dir="${BASE_DIR}"
afp_dir="${AFP_DIR}"
index_dir="${INDEX_DIR}/$2/$2$3"
log_dir="${LOG_DIR}/$2/$2$3"
trigger_dir="${TRIGGER_DIR}/$2/$2$3"
log_filename_suffix="${LOG_FILENAME_SUFFIX}"
output_file_path="${OUTPUT_FILE_PATH}/$2$3"
IndexCreation_DAILY_LOG=${log_dir}/${log_filename_suffix}.$(date +%m%d%y_%H%M%S)
metadata_file_name="${METADATA_FILENAME}"
trigger_file_prefix=`basename ${metadata_file_name%.dat}`
trigger_file_name="${trigger_file_prefix}.indexing"


if [[ ! -d "${log_dir}" ]];then
mkdir -p "${log_dir}"
fi

#rm -rf ${index_dir}/*

if [[ ! -d "${index_dir}" ]];then
mkdir -p "${index_dir}"
fi

if [[ ! -d "${afp_dir}" ]];then
mkdir -p "${afp_dir}"
fi

log "**********************************************************************************"
log "********Script**started**at***$(date '+%m/%d/%Y %H:%M:%S')************************"
log "**********************************************************************************"

rm -rf ${index_dir}/*

if [ $? != 0 ]
then
log "Unable to delete the old index files. Indexing failed, so creating failed trigger"
> ${trigger_dir}/${trigger_file_prefix}.indexfailed
exit 1

else
log "Successfully deleted the old index files from the directory ${index_dir}"
fi

identifier=$2
declare -i i=1
declare -i outfilecount=0

#Fetches the index values for the identifier passed in the argument
grep $identifier Indexfields_1 > tempfile1
indexfieldsnumber=`awk 'BEGIN {FS=","} ; END{print NF}' tempfile1`
log "fields to be present in index file are $indexfieldsnumber"
cat tempfile1


#Populates the fetched index values from previous step in an array.
declare -i j=1
declare -i k=0
while [[ $j -le $indexfieldsnumber ]] ; do
indexfieldname=`cut -d "," -f${j} tempfile1`
array[${k}]="$indexfieldname"
j=$j+1
k=$k+1

done
#Finished populating the index fields values for an identifier in the array.

declare -i outfilecount=0
declare -i numberoflinesread=0
declare -i linenumber=0 #debug purpose

while read line #read the metadata file
do

record="$line"
#record=$(echo "${record}" | tr -d '[[:space:]]')

declare -i mdfieldcount=0
declare -i arrayfieldnum=0

for fieldposition in "${array[@]}" #read the field name
        do

      #  groupfieldvalue=`echo ${line} | cut -d , -f${mdfieldcount}`

        #echo "fieldposition is $fieldposition and value is $groupfieldvalue"


        if [[ ${fieldposition} != ${2} ]]
        then
        groupfieldvalue=`echo ${line} | cut -d , -f${mdfieldcount}`
        groupfieldvalue=$(echo "${groupfieldvalue}" | tr -d '[:space:]')

#       if [[ $? != 0 ]]
#       then
#       log "unable to find the group field value for ${fieldposition}"
#       mv ${trigger_file_name} ${trigger_file_prefix}.failed
#       fi

                if [[ ${fieldposition} != "${DOCUMENT_NAME}" && ${fieldposition} != "${DOCUMENT_OFFSET}" && ${fieldposition} != "${DOCUMENT_LENGTH}" && ${fieldposition} != "${COMP_OFFSET}" && ${fieldposition} != "${COMP_LENGTH}" ]]
                then
                        echo  "GROUP_FIELD_NAME:${fieldposition}" >> ${index_dir}/afp${i}.ind
                        echo  "GROUP_FIELD_VALUE:${groupfieldvalue}" >> ${index_dir}/afp${i}.ind
                fi
        fi

        if [[ ${fieldposition} == "${DOCUMENT_NAME}" ]]
        then
        docname=${groupfieldvalue}
        docname="$(echo "$docname" | tr -d ' ')"
        fi

        if [[ ${fieldposition} == "${DOCUMENT_OFFSET}" ]]
        then
        docoff=${groupfieldvalue}
        fi

        if [[ ${fieldposition} == "${DOCUMENT_LENGTH}" ]]
        then
        doclen=${groupfieldvalue}
        fi

        if [[ ${fieldposition} == "${COMP_LENGTH}" ]]
        then
        complength=${groupfieldvalue}
        fi

        if [[ ${fieldposition} == "${COMP_OFFSET}" ]]
        then
        compoffset=${groupfieldvalue}
        fi

        filename="Decomp_${docname}_${compoffset}_${complength}.out"
        indexfilename="Decomp_${docname}_${compoffset}_${complength}.ind"
        filename=$(echo "${filename}" | tr -d '[:space:]')
        indexfilename=$(echo "${indexfilename}" | tr -d '[:space:]')
        currentfilename=$filename

        if [[ $previousfilename != $currentfilename ]]
        then
        newcompoffset=true

        fi

        mdfieldcount=${mdfieldcount}+1 #Increment the metadata field count to fetch the next value from the metadata file

        done

        echo "GROUP_OFFSET:${docoff}" >> ${index_dir}/afp${i}.ind
        echo "GROUP_LENGTH:${doclen}" >> ${index_dir}/afp${i}.ind
        echo "GROUP_FILENAME:${output_file_path}/${filename}" >> ${index_dir}/afp${i}.ind


        #debug purpose only

        if [[ $linenumber == 5000 ]]; then

        i=i+1
        linenumber=0
        echo  "CODEPAGE:850" >> ${index_dir}/afp${i}.ind

        fi


        #debug purpose only

       echo "finished processing for $linenumber"
       linenumber=linenumber+1


done < ${metadata_file_name}

log "removing the temp file containing the indexed fields"
rm -f tempfile1
rm -rf  ${index_dir}/afp*.ind

mv "${trigger_dir}/${trigger_file_prefix}.indexinprogress" "${trigger_dir}/${trigger_file_prefix}.indexed"

log "*************************************************************************************************"
log "********Script***completed**at***$(date '+%m/%d/%Y %H:%M:%S')*************************************"
log "*************************************************************************************************"

3. The attempts at a solution (include all code and scripts):

Included.

4. Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):
Utkal University, IND.


Last edited by Sandeep Pattnai; 09-21-2015 at 01:52 PM..
# 2  
Old 09-21-2015
Please provide the information for #4 above - THANK YOU.

PS: you invoke ksh but seem to have some bash code in your example. It will not run.
This User Gave Thanks to jim mcnamara For This Post:
# 3  
Old 09-21-2015
Jim,

The code runs, but the performance is slow. The for loop inside the while loop is causing the issue. It would be great if you could suggest an alternative approach that avoids this nested loop.
# 4  
Old 09-21-2015
True, the code is not ksh. My guess is that /usr/bin/ksh is a symlink to bash:
Code:
$ ls -l /usr/bin/ksh
... -> /bin/bash
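
One way to check that guess from inside the script is to look at the version variables the shells set. A minimal sketch, assuming bash sets `BASH_VERSION` and the ksh variants in use set `KSH_VERSION` (some older ksh releases may not); `shell_name` is just an illustrative variable:

```shell
# Report which interpreter is really executing this script.
if [ -n "${BASH_VERSION:-}" ]; then
    shell_name="bash"
elif [ -n "${KSH_VERSION:-}" ]; then
    shell_name="ksh"
else
    shell_name="unknown"
fi
echo "this script is being run by: $shell_name"
```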

#4 the School/University is still missing!
# 5  
Old 09-21-2015
Some data samples might help. Wouldn't a performance / time profile make sense?
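
A rough time profile can be had by running the same per-field pattern on a small slice of data and scaling the result up. A minimal sketch, assuming made-up sample rows (the file contents and the 300-row size are hypothetical) and using `date +%s`, which only has one-second resolution:

```shell
# Hypothetical sample data standing in for the real metadata file.
sample=$(mktemp)
n=0
while [ "$n" -lt 300 ]; do
    echo "doc$n, 100, 200, 5, 10"
    n=$((n + 1))
done > "$sample"

start=$(date +%s)
rows=0
while IFS= read -r line; do
    # Same pattern as the script: one pipeline of external commands per field.
    value=$(echo "$line" | cut -d , -f 2 | tr -d '[:space:]')
    rows=$((rows + 1))
done < "$sample"
end=$(date +%s)

echo "processed $rows rows in $((end - start)) s"
rm -f "$sample"
```

This times only one field per row. The posted script does this for up to 25 fields per row over 100k rows, so the measured time has to be multiplied accordingly.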
This User Gave Thanks to RudiC For This Post:
# 6  
Old 09-21-2015
It doesn't look like it's the loop that's the problem, to me. It's the creation of all those tiny files, and all those external tr -d calls, and the >> re-opening the same file over and over and over.
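
To put a number on it: 100k records times 25 fields is 2.5 million inner iterations, and each one runs at least a cut and a tr, so the script launches millions of external processes. A minimal sketch of the same nested loop using only shell builtins and a single output redirection, assuming bash or ksh93 and made-up names and records (`names`, `meta` and `out` are hypothetical):

```shell
# Runs in bash or ksh93; ${var//pattern/} is not plain POSIX sh.
# Field names and records are hypothetical stand-ins for the real data.
names="DOCUMENT_NAME,ACCOUNT,REGION"
meta=$(mktemp)
out=$(mktemp)
printf '%s\n' 'report1 , 12 345, EU' 'report2, 678 , US' > "$meta"

# Split the field-name list once, before the loop.
old_ifs=$IFS
set -f
IFS=,
set -- $names
IFS=$old_ifs
set +f

while IFS= read -r line; do
    rest=$line
    for name in "$@"; do
        value=${rest%%,*}              # text up to the first comma
        rest=${rest#*,}                # drop that field from the front
        value=${value//[[:space:]]/}   # strip whitespace without tr or sed
        printf 'GROUP_FIELD_NAME:%s\nGROUP_FIELD_VALUE:%s\n' "$name" "$value"
    done
done < "$meta" > "$out"                # the output file is opened only once

line_total=$(wc -l < "$out")
cat "$out"
rm -f "$meta" "$out"
```

The loops are still nested here; what changes is that nothing inside them starts a new process or reopens a file.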
This User Gave Thanks to Corona688 For This Post:
# 7  
Old 09-21-2015
Hi Corona688/Rudi C,

As per your suggestion, I have replaced the tr -d calls with sed to remove the spaces, but I still see the same performance. Could you please suggest an alternative solution to this problem?

Thanks
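
Replacing tr with sed still starts a new external process for every field of every line, so the cost is likely to stay about the same. One alternative is to let a single awk process do the whole name-to-value mapping in one pass over the file. A minimal sketch, assuming made-up names and records (`names`, `meta` and `out` are hypothetical) and, as in the posted script, that the first entry of the index-field line is the identifier rather than a field name:

```shell
# Hypothetical index-field line; the first entry is the table identifier.
names="ABC,DOCUMENT_NAME,ACCOUNT,REGION"
meta=$(mktemp)
out=$(mktemp)
printf '%s\n' 'report1 , 12 345, EU' 'report2, 678 , US' > "$meta"

awk -F, -v names="$names" '
BEGIN { n = split(names, name, ",") }
{
    # name[1] is the identifier, so input field f pairs with name[f + 1].
    for (f = 1; f < n; f++) {
        value = $f
        gsub(/[ \t]/, "", value)      # strip spaces and tabs inside awk
        print "GROUP_FIELD_NAME:" name[f + 1]
        print "GROUP_FIELD_VALUE:" value
    }
}' "$meta" > "$out"

result=$(cat "$out")
line_total=$(wc -l < "$out")
echo "$result"
rm -f "$meta" "$out"
```

The special handling of the document name, offsets and lengths, and the split into several afp index files, are left out here and would have to be added to the awk program.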