Slow Running Script (Reading 8000 lines)


 
# 1  
Old 07-12-2013

My script runs slowly, and the bottleneck seems to be the sed calls.
In summary, the script reads a list of usernames from file1. For each
username it searches two files (file2 and file3) for the username
and takes the value after the "=" on the following line, then compares
the two values.

If they are the same, the username is written to one file; if not, to another.
For approximately 8000 lines in file1 it takes about 15 minutes to run,
which is not very good. Any suggestions for removing the bottleneck?

Script:
Code:
#!/bin/ksh
compareusernames() {
p_name="$1"
result_one=`sed -n "/$p_name/{n;p;}" $file2 | cut -d= -f2 | tr -d ' '`
result_two=`sed -n "/$p_name/{n;p;}" $file3 | cut -d= -f2 | tr -d ' '`
if [[ "$result_one" = "$result_two" ]] then
        echo "$p_name" >> matches.out
else
        echo "$p_name" >> no_matches.out
fi
 
i=0
while read v_name
do
compareusernames "$v_name"
((i=$i+1))
done < $file1


}

File1:
Code:
user1
user2
user3
user4

File2:
Code:
name=user1
gud=100
name=user2
gud=200
name=user3
gud=300

File3:
Code:
name=user1
gud=100
name=user2
gud=xxx
name=user3
gud=xxx

# 2  
Old 07-12-2013
Try using this awk program instead:
Code:
awk '
        BEGIN {
                F = "file1"
                while (( getline line < F ) > 0 )
                {
                        A1[line]
                }
                close (F)

                F = "file2"
                while (( getline line < F ) > 0 )
                {
                        n = split ( line, V, "=" )
                        if ( V[2] in A1 )
                        {
                                i = V[2]
                                getline line < F
                                n = split ( line, V, "=" )
                                A2[i] = V[2]
                        }
                }
                close (F)

                F = "file3"
                while (( getline line < F ) > 0 )
                {
                        n = split ( line, V, "=" )
                        if ( V[2] in A1 )
                        {
                                i = V[2]
                                getline line < F
                                n = split ( line, V, "=" )
                                A3[i] = V[2]
                        }
                }
                close (F)
        }
        END {
                for ( k in A1 )
                {
                        if ( A2[k] == A3[k] && A2[k] && A3[k] )
                                print k > "matches.out"
                        else
                                print k > "no_matches.out"
                }

        }
' /dev/null

Let me know how long it took to complete execution.
# 3  
Old 07-12-2013
For every name in file1 you are reading both data files in full and pushing them through six processes (two sed, two cut, two tr). Not very efficient.

I'd try a language like awk to make looking up the data much easier, but the format of your data files is awkward too. Is that format fixed? Whichever approach you choose, it would be much easier if each line simply held a username and its gud value.
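
As a rough illustration of the one-line-per-user layout suggested above, here is a minimal sketch that flattens the two-line name/gud records. It assumes the sample file2 from post #1 and that every name= line is followed by a gud= line; the output name file2.flat is hypothetical. Run it in an empty scratch directory, since it overwrites file2.

```shell
# Sample data copied from post #1.
printf '%s\n' name=user1 gud=100 name=user2 gud=200 name=user3 gud=300 > file2

# Flatten "name=..." / "gud=..." pairs into one "username gud" line each.
awk -F= '
    $1 == "name" { u = $2; next }   # remember the current username
    $1 == "gud"  { print u, $2 }    # emit username and value on one line
' file2 > file2.flat

cat file2.flat
```

With both data files in this shape, each one only has to be read once, and the comparison becomes a simple keyed lookup.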
# 4  
Old 07-12-2013
I note that the original ksh script (when fixed to remove the syntax errors and properly terminate the function) includes user4 in matches.out and Yoda's awk script includes user4 in no_matches.out. When an entry in File1 does not appear in File2 or File3, should that entry be:
  1. added to matches.out,
  2. added to no_matches.out,
  3. ignored, or
  4. issue a diagnostic saying the entry was not found?
I don't understand why Yoda didn't use FS="=" instead of splitting lines after reading them, but until I know how to handle the issue above, I'm not going to post my awk script.
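
For reference, this is roughly what I mean by the fixed script: a minimal sketch, not necessarily what the original poster runs. It assumes literal file names file1, file2 and file3 in place of the unset $file1, $file2 and $file3 variables, uses the sample data from post #1, and drops the unused counter. Run it in an empty scratch directory, since it overwrites those files and the two .out files.

```shell
# Sample data copied from post #1.
printf '%s\n' user1 user2 user3 user4 > file1
printf '%s\n' name=user1 gud=100 name=user2 gud=200 name=user3 gud=300 > file2
printf '%s\n' name=user1 gud=100 name=user2 gud=xxx name=user3 gud=xxx > file3
rm -f matches.out no_matches.out

compareusernames() {
    p_name="$1"
    # Print the line that follows each line matching the name, keep the text after "=".
    result_one=$(sed -n "/$p_name/{n;p;}" file2 | cut -d= -f2 | tr -d ' ')
    result_two=$(sed -n "/$p_name/{n;p;}" file3 | cut -d= -f2 | tr -d ' ')
    if [ "$result_one" = "$result_two" ]; then
        echo "$p_name" >> matches.out
    else
        echo "$p_name" >> no_matches.out
    fi
}   # the function now ends here, before the loop that calls it

while read -r v_name
do
    compareusernames "$v_name"
done < file1
```

With this data, user4 ends up in matches.out because both lookups return an empty string and two empty strings compare equal.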
# 5  
Old 07-23-2013
don_cragun : Good point. These should be sent to a file - something like no_results.out

Also, if there is a null returned after the = in file2 then this should be sent to no_gud.out
# 6  
Old 07-23-2013
Step 1: make sed quit (q) after the first match instead of reading the rest of the file:
Code:
result_one=`sed -n "/$p_name/ {n;p;q;}" $file2 | cut -d= -f2 | tr -d ' '`

Step 2: replace the sed and cut pipeline with a single awk call:
Code:
result_one=`awk -F= 'm==1 {print $2; exit} $2~/'$p_name'/ {m=1}' $file2 | tr -d ' '`
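
One caveat with both steps is that the name is used as a regular expression, so user1 would also match a line such as name=user10. A possible variant, sketched here under the assumption that names must match exactly, passes the name in with -v and compares it as a string. The file name sample2 and its contents are hypothetical test data, not from the thread.

```shell
# Hypothetical sample where a regex match on "user1" would hit user10 first.
printf '%s\n' name=user10 gud=999 name=user1 gud=100 > sample2

p_name=user1
result_one=$(awk -F= -v name="$p_name" '
    found                        { print $2; exit }   # line after the name line: print value, stop
    $1 == "name" && $2 == name   { found = 1 }        # exact string match on the username
' sample2 | tr -d ' ')

echo "$result_one"
```

This still starts one awk process per name, so it only trims the per-call cost; reading each file once remains the bigger win.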

# 7  
Old 07-23-2013
Quote:
Originally Posted by u20sr
don_cragun : Good point. These should be sent to a file - something like no_results.out

Also, if there is a null returned after the = in file2 then this should be sent to no_gud.out
Your requirements are still ambiguous. The following awk script puts a name found in File1 into one of four files:
  1. in no_results.out if the name does not appear in File2 and does not appear in File3,
  2. in no_gud.out if the name appears in File2 or File3 but not in both, or if the gud=value line in either file has an empty value string,
  3. in no_matches.out if the name appears in File2 and File3 and the value in gud=value in both files is not empty but the values are different, or
  4. in matches.out if the name appears in both files, neither value is empty, and both values are identical.
If this isn't what you want, please restate your exact requirements.
Code:
awk -F= '
FILENAME != lf {
        f++
        lf = FILENAME
}
$1 == "name" {
        u = $2
        next
}
$1 == "gud" {
        f == 1 ? r1[u] = $2 : r2[u] = $2
}
f == 3 {if(!($1 in r1) && !($1 in r2)) print > "no_results.out"
        else    if(!($1 in r1) || r1[$1] == "" ||
                   !($1 in r2) || r2[$1] == "") print > "no_gud.out"
        else    if(r1[$1] != r2[$1]) print > "no_matches.out"
        else    print > "matches.out"
}' File2 File3 File1

As always, if you try running this on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of /bin/awk or /usr/bin/awk.