The UNIX and Linux Forums  


02-08-2007
gurpreet470
Shell script for parsing a 300 MB log file

I am relatively new to shell scripting.
I have written a script for parsing a big file. The logic is:
Apart from a lot of other irrelevant content, the file has many occurrences of <abc> and corresponding </abc> tags (all of them properly closed).
My requirement is to find a particular tag (say <data>1234</data>) enclosed anywhere between the <abc> and </abc> tags.
If it is found, I have to store the 4th line below the <abc> tag in a temp file.

A typical log file looks like:

************************
<pqr>
......
some data
some other data
.........
</pqr>
some text data
...........
<abc>
blah
blah
.....
<id>12345</id>
blah...
......
<data>1234</data>
</abc>
........
.....
.....

<abc>
blah
blah
.....
<id>12345</id>
blah...
...
</abc>
..........
<rst>
...
...
</rst>
some text data...

****************************

The output of the script should be <id>12345</id>, stored in a temp file.
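For comparison, the requirement described above could also be sketched as a single awk pass that never writes a per-block temp file. This is only a minimal sketch under some assumptions: the function name extract_line and its arguments are made up for illustration, leading and trailing whitespace is stripped before comparing (roughly what a plain `read` does), and "4th line below" is counted from the line after <abc>.

```shell
#!/bin/sh
# extract_line: hypothetical helper, not part of the script in this post.
#   $1 = log file
#   $2 = exact line to look for inside an <abc> block
#   $3 = how many lines below <abc> the wanted line sits (assumed default: 4)
extract_line() {
    awk -v want="$2" -v offset="${3:-4}" '
        # Work on a copy of the line with surrounding blanks removed
        { line = $0; gsub(/^[ \t]+|[ \t]+$/, "", line) }

        # Opening tag: reset the per-block state
        line == "<abc>" { inblock = 1; n = 0; found = 0; keep = ""; next }

        # Closing tag: print the remembered line only if the block matched
        line == "</abc>" {
            if (inblock && found && keep != "") print keep
            inblock = 0
            next
        }

        inblock {
            n++                           # lines seen since <abc>
            if (n == offset) keep = line  # candidate line to report
            if (line == want) found = 1   # block contains the wanted tag
        }
    ' "$1"
}
```

It might be called as `extract_line "$1.log" '<data>1234</data>' > temp.log`. Each line of the log is read once, which may matter for a file of this size.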

The script I am using is:

********************

rm -f temp.log
filename=$1
OK=0

while read line1
do
    # Start collecting when an <abc> block opens
    if [ "$line1" = "<abc>" ]; then
        OK=1
    fi

    if [ "$OK" -eq 1 ]; then
        echo $line1 >> temp_file
    fi

    if [ "$line1" = "</abc>" ]; then
        OK=0
    fi

    # Block closed: scan what was collected, then discard it
    if [ "$OK" -eq 0 ] ; then
        if [ -f temp_file ]; then
            while read line2
            do
                if [ "$line2" = "<data>1234</data>" ]; then
                    cat temp_file | awk '{ if ( NR == 4){print($0) } }' >> temp.log
                fi
            done < temp_file
            rm temp_file
        fi
    fi
done < $filename.log

*******************************

The <abc></abc> blocks generally (not always) come in the last portion of the file, somewhere after line 500000, and the file usually has around 700000 lines.

The script runs for a long time, and I do find the 2 records from the initial lines stored in the temp file. But after some 6-7 minutes, the script ends abruptly with:
scriptname.sh test: argument expected.

Can someone help me out on this?
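
One guess at the "test: argument expected" message, since the failing log line is not shown: some line in the file may look like an operator to `[` (for example a line that is only `=`, `!` or `(`), which some older sh implementations reportedly mishandle even when the variable is quoted. A minimal sketch of a comparison that avoids `[` altogether by using `case`; the helper name line_is is hypothetical:

```shell
#!/bin/sh
# line_is: hypothetical helper, not part of the script in this post.
# Returns 0 when $1 is exactly the string $2, 1 otherwise.
# The pattern is quoted, so characters such as * or ? in $2 match literally.
line_is() {
    case "$1" in
        "$2") return 0 ;;
        *)    return 1 ;;
    esac
}

# Example use inside the read loop (assumed shape):
#   if line_is "$line1" "<abc>"; then OK=1; fi
```

Reading with `IFS= read -r line1` and writing with `printf '%s\n' "$line1"` would additionally keep backslashes and wildcard characters in log lines from being altered, if that matters for this data.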