Shell script for parsing a 300 MB log file
I am relatively new to shell scripting.
I have written a script for parsing a big file. The logic is:
Apart from a lot of other irrelevant content, there are many occurrences of <abc> and corresponding </abc> tags. (All of them are properly closed.)
My requirement is to find a particular tag (say <data>1234</data>) enclosed anywhere between the <abc> and </abc> tags.
If it is found, I have to store the 4th line below the <abc> tag in a temp file.
A typical log file looks like:
************************
<pqr>
......
some data
some other data
.........
</pqr>
some text data
...........
<abc>
blah
blah
.....
<id>12345</id>
blah...
......
<data>1234</data>
</abc>
........
.....
.....
<abc>
blah
blah
.....
<id>12345</id>
blah...
...
</abc>
..........
<rst>
...
...
</rst>
some text data...
****************************
The output of the script should be <id>12345</id>, stored in a temp file.
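For reference, the requirement described above could be sketched as a single awk pass over the file. This is only a sketch under assumptions: extract_ids is a hypothetical helper name, the awk is assumed to be POSIX-style (supporting -v and gsub), and the tags are assumed to sit alone on their lines apart from surrounding whitespace.

```shell
#!/bin/sh
# Sketch only: extract_ids is a hypothetical helper, not part of the script in this post.
# Usage: extract_ids LOGFILE WANTED_LINE
extract_ids() {
    awk -v want="$2" '
        {
            line = $0
            gsub(/^[ \t]+|[ \t]+$/, "", line)   # trim surrounding blanks
        }
        line == "<abc>" {                       # start of a block: reset state
            inblk = 1; n = 0; hit = 0; keep = ""
            next
        }
        inblk && line == "</abc>" {             # end of a block
            if (hit) print keep                 # print only if the wanted line was seen
            inblk = 0
            next
        }
        inblk {
            n++
            if (n == 4) keep = line             # 4th line below <abc>
            if (line == want) hit = 1
        }
    ' "$1"
}
```

It would be called as: extract_ids "$filename.log" '<data>1234</data>' > temp.log. Nothing is buffered except one line per block, so no intermediate temp_file is needed.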
The script I am using is:
********************
rm -f temp.log
filename=$1
OK=0
while read line1
do
    if [ "$line1" = "<abc>" ]; then
        OK=1
    fi
    if [ "$OK" -eq 1 ]; then
        echo $line1 >> temp_file
    fi
    if [ "$line1" = "</abc>" ]; then
        OK=0
    fi
    if [ "$OK" -eq 0 ] ; then
        if [ -f temp_file ]; then
            while read line2
            do
                if [ "$line2" = "<data>1234</data>" ]; then
                    cat temp_file | awk '{ if ( NR == 4){print($0) } }' >> temp.log
                fi
            done < temp_file
            rm temp_file
        fi
    fi
done < $filename.log
*******************************
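A variant of the same loop that compares strings with case rather than [ ... ] could look like the sketch below. The motivation is an assumption, since the log line being read when the script stops is not known: some older test implementations are reported to misparse an operand that itself looks like an operator (a line consisting only of -f, ( or !, for example), and case does not go through test at all. scan_blocks is a hypothetical name, and a POSIX shell (read -r, $((...))) is assumed.

```shell
#!/bin/sh
# Sketch only: scan_blocks is a hypothetical helper mirroring the loop above.
# Usage: scan_blocks LOGFILE WANTED_LINE
scan_blocks() {
    in_block=0; count=0; found=0; fourth=
    while read -r line; do
        case $line in
            "<abc>")
                # Start of a block: reset per-block state.
                in_block=1; count=0; found=0; fourth=
                continue ;;
            "</abc>")
                # Emit the saved line only if this block held the wanted line.
                if [ "$in_block" -eq 1 ] && [ "$found" -eq 1 ]; then
                    printf '%s\n' "$fourth"
                fi
                in_block=0
                continue ;;
        esac
        if [ "$in_block" -eq 1 ]; then
            count=$((count + 1))
            # Remember the 4th line below <abc>.
            if [ "$count" -eq 4 ]; then fourth=$line; fi
            case $line in
                "$2") found=1 ;;   # quoted, so matched literally
            esac
        fi
    done < "$1"
}
```

Note that in the script as posted, temp_file starts with the <abc> line itself, so NR == 4 selects the third line below the tag; the sketch counts from the first line after <abc>. A shell loop like this is still likely to be slow on a file of this size.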
The <abc></abc> tags generally (not always) occur in the last portion of the file, after roughly 500,000 lines; the file usually has around 700,000 lines.
The script runs for a long time, and I do find 2 records, which come from the initial lines of the file, stored in the temp file. But after some 6-7 minutes, the script ends abruptly with this message:
scriptname.sh test: argument expected.
Can someone help me out with this?