Performance issue in shell script
Post 302908461 by ureddy on Tuesday 8th of July 2014 04:30:05 AM (Shell Programming and Scripting)

Hi All,


I am facing a performance issue while running a Linux shell script.

I have file1 and file2. file1 is the source file and file2 is the lookup file. Wherever a pattern from file2 matches in file1, it needs to be replaced.
The order of the lookup file is important: as soon as any lookup entry matches a record, the loop should exit with no further searching for that record, and the search continues with the next record.
Code:
file1
------
one|xxxx|111111NEW YORK|abcd
two|yyy|TEXAS 222222TEXASTEXAS|defg
three|zzzz|CALIFORNIA TEXAS TEXAS 3333 CALIFORNIA|defg
four|kkkk|DALLAS DALLAS|defg
 
file2
-----
NEW YORK,NY
CALIFORNIA,CA
TEXAS,TX

If the 1st field of a file2 record matches within the 3rd field of a file1 record, then I need to do the following:
  1. If the string is present only once, don't replace it; just add field 2 from the lookup and |2|N at the end of the line.
  2. If the string is present more than once, leave the first occurrence, replace the remaining occurrences, and add field 2 from the lookup and |2|Y at the end of the line.
If there is no match, just add a space and |2|N at the end of the line.
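
The rules above can be sketched for a single record as follows. This is only an illustration under stated assumptions, not the script from this post: the function name apply_lookup and the arrays keys and vals are hypothetical, the lookup pairs are hard-coded in file2 order, and each record is assumed to have at least four |-separated fields. It uses only bash parameter expansion, so no external command is started per record.

```shell
#!/usr/bin/env bash
# Hypothetical lookup table, in the same order as file2 (order matters).
keys=("NEW YORK" "CALIFORNIA" "TEXAS")
vals=("NY" "CA" "TX")

# apply_lookup RECORD: print RECORD with the lookup rules applied to field 3.
apply_lookup() {
    local line=$1 f1 f2 field3 suffix r key val head tail new3 i
    # Split "f1|f2|f3|rest" with parameter expansion (assumes >= 4 fields).
    f1=${line%%|*};     r=${line#*|}
    f2=${r%%|*};        r=${r#*|}
    field3=${r%%|*};    suffix=${r#*|}

    for i in "${!keys[@]}"; do
        key=${keys[i]}
        val=${vals[i]}
        [[ $field3 == *"$key"* ]] || continue   # no match: try the next lookup entry
        head=${field3%%"$key"*}                 # text before the first occurrence
        tail=${field3#*"$key"}                  # text after the first occurrence
        if [[ $tail == *"$key"* ]]; then
            # Present more than once: keep the first, replace all later ones.
            new3=$head$key${tail//"$key"/$val}
            printf '%s|%s|%s|%s|%s|2|Y\n' "$f1" "$f2" "$new3" "$suffix" "$val"
        else
            # Present exactly once: leave field 3 untouched.
            printf '%s|%s|%s|%s|%s|2|N\n' "$f1" "$f2" "$field3" "$suffix" "$val"
        fi
        return                                  # first matching entry wins
    done
    printf '%s| |2|N\n' "$line"                 # no lookup entry matched
}

apply_lookup 'two|yyy|TEXAS 222222TEXASTEXAS|defg'
# prints: two|yyy|TEXAS 222222TXTX|defg|TX|2|Y
```

Returning on the first matching entry is what enforces the "order of the lookup file" requirement: later entries are never tested once one has matched.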

So the expected output is below.

Code:
 
one|xxxx|111111NEW YORK|abcd|NY|2|N (NEW YORK matched but is present only once, so it is not replaced. Also, as a match was found, exit from the loop; no need to search and replace further.)
two|yyy|TEXAS 222222TXTX|defg|TX|2|Y (TEXAS is present more than once, so it is replaced from the 2nd occurrence onward and the first occurrence is left as is.)
three|zzzz|CALIFORNIA TEXAS TEXAS 3333 CA|defg|CA|2|Y (Only the 2nd occurrence of CALIFORNIA is replaced. TEXAS is not replaced because a match (CALIFORNIA) was already found, so the loop exits without trying the remaining lookup entries.)
four|kkkk|DALLAS DALLAS|defg| |2|N (No match, so nothing is replaced.)

I have tested the code below and it works fine, but it takes too long. It processes about 1 record per second and I have 1,000,000 records to process, which works out to more than 11 days.
Can anyone help me tune this script?

The code is below:

Code:
# LOG, REP_FILE_PATH, REP_FILE and del_tmp_files are defined elsewhere in the script.
echo "Replace the string matches only once or except FIRST occurrence replace ALL." >>"$LOG"
tot_cnt=`wc -l < "$REP_FILE_PATH/$REP_FILE"`
del_tmp_files
 
# IFS='' and read -r are used to preserve leading and trailing spaces.
while IFS='' read -r line; do
i=0
while read -r rep_line; do
# Split the lookup record (not the source record) into pattern and replacement.
field[1]=`cut -d',' -f1 <<<"$rep_line"`
field[2]=`cut -d',' -f2 <<<"$rep_line"`
cnt=`echo -n "$line" | grep -o "${field[1]}" | wc -l`
if [[ "$cnt" -eq 1 ]] ; then
sed -e "s/$/|${field[2]}|2|N/" <<<"$line" >> tmp.txt
break
fi
if [[ "$cnt" -gt 1 ]] ; then
# "2g" (GNU sed) replaces from the 2nd occurrence onward.
sed -e "s/${field[1]}/${field[2]}/2g" -e "s/$/|${field[2]}|2|Y/" <<<"$line" >> tmp.txt
break
fi
let i++
# Last lookup record reached without any match.
if [[ "$cnt" -eq 0 && "$tot_cnt" -eq $i ]] ; then
sed -e "s/$/| |2|N/" <<<"$line" >> tmp.txt
fi
done < file2.txt
done < file1.txt
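
The loop above starts several processes (cut, grep, wc, sed) for every combination of source record and lookup record, which is a likely reason for the one-record-per-second rate. One possible alternative, sketched here under assumptions, is a single awk process that keeps the lookup table in memory and applies the same rules to field 3. The function name replace_states is hypothetical, matching is literal (via index, not a regular expression), and lookup lines are assumed to have the form KEY,VALUE.

```shell
#!/usr/bin/env bash
# replace_states LOOKUP_FILE SOURCE_FILE
# Sketch: one awk process for the whole run instead of several per record.
replace_states() {
    awk -F'|' -v OFS='|' '
        # First file (lookup): store keys and values in file order.
        NR == FNR {
            pos = index($0, ",")
            if (pos < 2) next              # skip lines without a non-empty key
            n++
            key[n] = substr($0, 1, pos - 1)
            val[n] = substr($0, pos + 1)
            next
        }
        # Second file (source): try lookup entries in order against field 3.
        {
            code = " "; flag = "N"
            for (i = 1; i <= n; i++) {
                p = index($3, key[i])
                if (p == 0) continue       # this entry does not match
                code = val[i]
                klen = length(key[i])
                head = substr($3, 1, p + klen - 1)   # up to and including first occurrence
                tail = substr($3, p + klen)          # everything after it
                if (index(tail, key[i]) > 0) {
                    # More than one occurrence: replace every later one.
                    out = ""
                    while ((q = index(tail, key[i])) > 0) {
                        out = out substr(tail, 1, q - 1) val[i]
                        tail = substr(tail, q + klen)
                    }
                    $3 = head out tail
                    flag = "Y"
                }
                break                      # first matching entry wins
            }
            print $0, code, 2, flag
        }
    ' "$1" "$2"
}
```

It would be called as `replace_states file2.txt file1.txt > tmp.txt`. The work is still proportional to records times lookup entries, but it happens inside one process, so the per-record process start-up cost disappears.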


Last edited by rbatte1; 08-11-2014 at 12:49 PM.. Reason: Added codes
 
