Performance issue with awk script.


 
# 1  
07-08-2008
Performance issue with awk script.

Hi,

The below awk script is taking about 1 hour to fetch just 11 records (columns). There are about 48000 records. The script file name is take_first_uniq.sh.

Code:
#!/bin/ksh

if [ $# -eq 2 ]
then
    while read line
    do
        first=`echo $line | awk -F"|" '{print $1$2$3}'`
        while read line2
        do
            second=`echo $line2 | awk -F"|" '{print $7$13$14}'`
            if [ ${first} == ${second} ]
            then
                echo $line2
            fi
        done < $2
    done < $1
fi

I call the script this way:

Code:
ksh take_first_uniq.sh file_3uniq_fields.out file_sort_all_fields.out > file_uniq_master.out


Please suggest how I can improve the performance. I'm new to awk scripting.

Thanks,
RRVARMA
# 2  
07-08-2008
Try something like the following (which is untested, since you did not post a sample of your data files).
Code:
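#!/bin/ksh

# Expect exactly two arguments: the key file and the data file.
[[ $# != 2 ]] && exit 1

# Split input lines on "|" so read can pick out the individual fields.
IFS="|"
while read -r v1 v2 v3 rest
do
    first="${v1}${v2}${v3}"
    while read -r v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14 rest
    do
        # Rejoin the fields with "|" so a matching record is printed in its original form.
        [[ ${first} == "${v7}${v13}${v14}" ]] && print -r -- "$v1|$v2|$v3|$v4|$v5|$v6|$v7|$v8|$v9|$v10|$v11|$v12|$v13|$v14${rest:+|$rest}"
    done < "$2"
done < "$1"

exit 0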
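
For comparison, a common way to avoid the nested loops altogether is to let a single awk process do the join: load the keys from the first file into an array, then scan the second file once. The sketch below is untested against the real files and rests on assumptions the thread does not confirm. The function name take_first_uniq is only illustrative, the desired output is assumed to be every record of the second file whose fields 7, 13 and 14 match fields 1, 2 and 3 of some record in the first file, and the output follows the order of the second file. Unlike the original script, the key fields are joined with the separator, so "AB" + "C" is not mistaken for "A" + "BC".

```shell
# Hypothetical helper: print the records of file 2 whose key (fields 7, 13, 14)
# appears as fields 1, 2, 3 of some record in file 1.
# Assumption: both files are "|"-delimited and the first file is not empty.
take_first_uniq() {
    [ $# -eq 2 ] || return 1
    awk -F'|' '
        # First file: remember each three-field key, then skip to the next record.
        NR == FNR { keys[$1 FS $2 FS $3]; next }
        # Second file: build the key from fields 7, 13, 14 and print on a match.
        {
            key = $7 FS $13 FS $14
            if (key in keys) print
        }
    ' "$1" "$2"
}
```

It would be called the same way as the original script, for example take_first_uniq file_3uniq_fields.out file_sort_all_fields.out > file_uniq_master.out. Each file is read once and no process is started per record. A record is printed once even if its key occurs several times in the first file, which may differ from what the nested loops produce.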

# 3  
07-08-2008
Quote:
The below awk script[...]
This is not an AWK script; it is a shell script that includes some AWK code.

If you post sample input and the desired output we could try to help ...
# 4  
07-10-2008
sample records.

Hi fpmurphy & radoulov,

Thanks for the feedback.

These are the sample records for the first file, file_3uniq_fields.out:

Code:
1TVAO|OVEPT|VO
1TVAO|OVPDM|VO
6NFXE|17CLP|DH
6NFXE|NRZO4|EQ
6NFXE|SMOSA|EQ
ACA15|11X1W|DX
ACA15|1LN88|DX
ACA15|1LNSK|DX
ACA15|1LNVX|DX
ACA15|1LNVX|FD

And here are the sample records for the second file, file_sort_all_fields.out:

Code:
1TVAO|S3zS033306|4577777770|4513201000|AJBFGJ|CB10|1TVAO|S3WS033306|4513101000|4513201000|AJBFGJ|CB10|OVEPT|VO|430300|430300|430300|009|IC    |Z|N|Y|IS
1TVAO|S3zS033306|4515685200|4513201000|AJBFGJ|CB10|1TVAO|S3WS033306|4513101000|4513201000|AJBFGJ|CB10|OVPDM|VO|430300|430300|430300|009|IC    |Z|N|Y|IS
6NFXE|S3Sr021401|4522451000|4511201000|B7BXHT|CB10|6NFXE|S3SN021401|4511101000|4511201000|B7BXHT|CB10|17CLP|DH|******|6670NI|410402|011|LQ    |Z|A|Y|IS
AGRJE|NA|NA|NA|NA|NA|6NFXE|S3SN021401|4511101000|4511201000|B7BXHT|CB10|NRZO4|EQ|402100|6670DC|410402|001|EQ|Z|U|Y|VT
6NFXE|S3Sz021401|4522201000|4511201000|B7BXHT|CB10|6NFXE|S3SN021401|4511101000|4511201000|B7BXHT|CB10|SMOSA|EQ|******|6670NI|410402|016|EQ    |Z|U|Y|IS
ACA15|S3Bz100120|4522201000|4511201000|AEBDHZ|CB10|ACA15|S3BW100120|4511101000|4511201000|AEBDHZ|CB10|11X1W|DX|410312|410312|410312|011|LQ    |Z|A|Y|IS
ACA15|S3BW100120|4512541000|4511201000|AEBDHZ|CB10|ACA15|S3BW100120|4511101000|4511201000|AEBDHZ|CB10|1LN88|DX|410312|410312|410312|A14|IOC   |Z|N|Y|IS
ARCXE|NA|NA|NA|NA|NA|ACA15|S3BW200120|4511101000|4511201000|AEBDHZ|CB10|1LN88|DX|410312|420100|420100|A14|IOC   |Z|N|Y|IS
ACA15|NA|NA|NA|NA|NA|ACA15|NA|NA|NA|NA|NA|1LNSK|DX|410312|410312|410312|A14|TC    |Z|N|Y|IS
ACA15|NA|NA|NA|NA|NA|ACA15|NA|NA|NA|NA|NA|1LNVX|DX|410312|410312|410312|009|IOC   |Z|N|Y|IS

Thanks,
RRVARMA
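
As a quick check of how the two samples relate, the key of a record in the second file can be extracted the same way the script does, from fields 7, 13 and 14. The snippet below is only an illustration: record_key is a made-up helper name, and the fields are joined with "|" so the result can be compared directly with a line of the first file.

```shell
# Hypothetical helper: print the three-field key (fields 7, 13 and 14)
# of each "|"-delimited record read from standard input.
record_key() {
    awk -F'|' '{ print $7 FS $13 FS $14 }'
}

# First sample record of file_sort_all_fields.out, copied from the post above.
sample='1TVAO|S3zS033306|4577777770|4513201000|AJBFGJ|CB10|1TVAO|S3WS033306|4513101000|4513201000|AJBFGJ|CB10|OVEPT|VO|430300|430300|430300|009|IC    |Z|N|Y|IS'

printf '%s\n' "$sample" | record_key   # prints 1TVAO|OVEPT|VO
```

The printed key is identical to the first sample line of file_3uniq_fields.out, so this record is one the script would select.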
# 5  
07-10-2008
... and what should the desired output (file_uniq_master.out) look like?