Script to parse a file faster


 
# 1  
Old 08-30-2011

My example file is as given below:

conn=1 uid=oracle
conn=2 uid=db2
conn=3 uid=oracle
conn=4 uid=hash
conn=5 uid=skher
conn=6 uid=oracle
conn=7 uid=mpalkar
conn=8 uid=anarke
conn=1 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.10.5.6 to 10.18.6.5
conn=2 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.20.35.10 to 10.18.6.5
conn=3 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.30.35.19 to 10.18.6.5
conn=4 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.40.35.11 to 10.18.6.5
conn=5 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.50.35.12 to 10.18.6.5
conn=6 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.10.35.14 to 10.18.6.5
conn=7 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.20.35.15 to 10.18.6.5
conn=8 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.20.35.16 to 10.18.6.5

I need to write a script which will grep for "uid=oracle" and find the IP address the connection was initiated from,
using the connection ID "conn=x".

This is a simplified sample; the actual file runs to gigabytes.

I tried doing this with cat and a for loop, but the script takes at least 2 days to complete.

Is there a faster way to do this using perl or awk?

I would like an output something like this:
connid=x IP=w.x.y.z

Any help would certainly be appreciated!
# 2  
Old 08-30-2011
Here's an awk programme that will give you what I think you want:

Code:
#!/usr/bin/ksh

awk '
    /uid=oracle/ { split( $1, a, "=" ); uids[a[2]] = 1; next; }
    /connection from/ {
        split( $1, a, "=" );
        if( uids[a[2]] )
            printf( "connid=%s IP=%s\n", a[2],  $(NF-2) );
    }
' input-file-name

It may take a few minutes to chew through a file of a few GiB, but I think it will be faster than what you've experienced.
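Before pointing this at a multi-GB log, it can be checked against a few lines of the sample from post #1. The following is only a sketch: the function name `find_oracle_ips` and the temporary file are assumptions made for illustration, and the awk program is the one from this post.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk program from the post above.
find_oracle_ips() {
    awk '
        /uid=oracle/ { split($1, a, "="); uids[a[2]] = 1; next }
        /connection from/ {
            split($1, a, "=")
            if (uids[a[2]])
                printf("connid=%s IP=%s\n", a[2], $(NF-2))
        }
    ' "$1"
}

# Build a small test file from the sample data in post #1.
sample=$(mktemp)
cat > "$sample" <<'EOF'
conn=1 uid=oracle
conn=2 uid=db2
conn=3 uid=oracle
conn=1 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.10.5.6 to 10.18.6.5
conn=2 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.20.35.10 to 10.18.6.5
conn=3 op=-1 msgId=-1 - fd=104 slot=104 LDAPS connection from 10.30.35.19 to 10.18.6.5
EOF

find_oracle_ips "$sample"
# prints:
#   connid=1 IP=10.10.5.6
#   connid=3 IP=10.30.35.19
rm -f "$sample"
```

The file is read once, and each check against `uids` is an array lookup rather than another scan of the file, which is presumably where the time saving over a per-connection loop comes from.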
# 3  
Old 08-30-2011
Code:
$ ruby -ane 'BEGIN{a={}}; if $F[1]=="uid=oracle" then a[$F[0]]=true; next; end; print "#{$F[0]}: #{$F[9]}\n" if a[$F[0]]' file

# 4  
Old 08-30-2011
Code:
[26/Aug/2011:11:24:20 +0000] conn=9978792 op=1 msgId=2 - SRCH base="ou=people,dc=abc,dc=com" scope=1 filter="(&(objectClass=shadowAccount)(uid=oracle))" attrs="uid userPassword shadowLastChange shadowMax shadowMin shadowWarning shadowInactive shadowExpire shadowFlag"
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=1 msgId=2 - SRCH base="ou=people,dc=abc,dc=com" scope=1 filter="(&(objectClass=shadowAccount)(uid=oracle))" attrs="uid userPassword shadowLastChange shadowMax shadowMin shadowWarning shadowInactive shadowExpire shadowFlag"
[26/Aug/2011:11:24:22 +0000] conn=9978794 op=1 msgId=2 - SRCH base="ou=people,dc=abc,dc=com" scope=1 filter="(&(objectClass=shadowAccount)(uid=oracle))" attrs="uid userPassword shadowLastChange shadowMax shadowMin shadowWarning shadowInactive shadowExpire shadowFlag"
[26/Aug/2011:11:24:23 +0000] conn=9978795 op=1 msgId=2 - SRCH base="ou=people,dc=abc,dc=com" scope=1 filter="(&(objectClass=shadowAccount)(uid=oracle))" attrs="uid userPassword shadowLastChange shadowMax shadowMin shadowWarning shadowInactive shadowExpire shadowFlag"
[26/Aug/2011:11:24:30 +0000] conn=9978802 op=1 msgId=2 - SRCH base="ou=people,dc=abc,dc=com" scope=1 filter="(&(objectClass=shadowAccount)(uid=oracle))" attrs="uid userPassword shadowLastChange shadowMax shadowMin shadowWarning shadowInactive shadowExpire shadowFlag"
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=-1 msgId=-1 - fd=559 slot=559 LDAPS connection from 10.20.13.2:30999 to 10.183.7.45
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=-1 msgId=-1 - SSL 256-bit AES-256
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=0 msgId=1 - BIND dn="" method=128 version=3
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=0 msgId=1 - RESULT err=0 tag=97 nentries=0 etime=0 dn=""
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=1 msgId=2 - SRCH base="ou=people,dc=abc,dc=com" scope=1 filter="(&(objectClass=shadowAccount)(uid=oracle))" attrs="uid userPassword shadowLastChange shadowMax shadowMin shadowWarning shadowInactive shadowExpire shadowFlag"
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=1 msgId=2 - RESULT err=0 tag=101 nentries=1 etime=0
[26/Aug/2011:11:24:22 +0000] conn=9978793 op=2 msgId=0 - RESULT err=80 tag=120 nentries=0 etime=0
[26/Aug/2011:11:24:22 +0000] conn=9978793 op=-1 msgId=-1 - closing from 10.104.15.2:30988 - A1 - Client aborted connection -


To adapt your script to the file shown above, should I change it to this?
Code:
#!/usr/bin/ksh

awk '
    /uid=oracle/ { split( $3, a, "=" ); uids[a[2]] = 1; next; }
    /connection from/ {
        split( $3, a, "=" );
        if( uids[a[2]] )
            printf( "connid=%s IP=%s\n", a[2],  $(NF-2) );
    }
' input-file-name
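One thing to watch with the real log format: the script above only prints when the uid=oracle record is read before the "connection from" record for the same connection. If the log writes the connection record first, nothing is printed for it. The sketch below does not depend on record order and also strips the trailing ":port" from the client address. The function name `oracle_conn_ips` and the array names are assumptions for illustration, and it needs an awk that supports user-defined functions (nawk, gawk, or any POSIX awk).

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.
oracle_conn_ips() {
    awk '
        # Print a connection at most once, as soon as both its IP and
        # the fact that it searched for uid=oracle are known.
        function report(id) {
            if (wanted[id] && (id in ip) && !printed[id]) {
                printf("connid=%s IP=%s\n", id, ip[id])
                printed[id] = 1
            }
        }
        /connection from/ {
            split($3, a, "=")
            addr = $(NF-2)
            sub(/:[0-9]+$/, "", addr)   # drop the trailing :port
            ip[a[2]] = addr
            report(a[2])
            next
        }
        /uid=oracle/ {
            split($3, a, "=")
            wanted[a[2]] = 1
            report(a[2])
        }
    ' "$1"
}

# Two records from the excerpt above, connection record first.
log=$(mktemp)
cat > "$log" <<'EOF'
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=-1 msgId=-1 - fd=559 slot=559 LDAPS connection from 10.20.13.2:30999 to 10.183.7.45
[26/Aug/2011:11:24:21 +0000] conn=9978793 op=1 msgId=2 - SRCH base="ou=people,dc=abc,dc=com" scope=1 filter="(&(objectClass=shadowAccount)(uid=oracle))" attrs="uid userPassword"
EOF

oracle_conn_ips "$log"
# prints: connid=9978793 IP=10.20.13.2
rm -f "$log"
```

The trade-off is memory: this version remembers the address of every connection, not only the oracle ones, so on a very large log the `ip` array can grow considerably.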


Last edited by pludi; 08-30-2011 at 06:59 PM..
# 5  
Old 08-30-2011
Can you try the one below? Based on your sample file, the msgId filter would not actually be required, however.
Code:
awk '/oracle/{a[$1]=$1;next}/msgId/{if(a[$1]){printf("%s IP=%s\n", a[$1],$(NF-2))}}' inputfile

# 6  
Old 08-30-2011
Thanks a lot, agama. I have modified the script to suit my requirement, but could you please explain how the code works?

I am really not able to figure it out.
# 7  
Old 08-30-2011
Glad you were able to get something to work. Some comments that should help explain things:

Code:
#!/usr/bin/ksh

awk '
    # for each record from the input file, test to see if we should execute each block of code...
    /uid=oracle/ {              # execute this block when the string "uid=oracle" is found in the record
        split( $3, a, "=" );    # split the third field into the array a using "=" as the separator; a[2] is the id
        uids[a[2]] = 1;         # track all ids that we have seen
        next;                   # skip the remainder of the programme, read next record and start processing
    }

    /connection from/ {         # execute this block when "connection from" is in the record
        split( $3, a, "=" );    # split the connection id into array a
        if( uids[a[2]] )        # if we saw this id as an oracle id earlier, uids[id]
                                # will be non-zero and thus true, then print the info
            printf( "connid=%s IP=%s\n", a[2], $(NF-2) );   # $(NF-2) is the third field from the end: the client address
    }
' input-file-name

And not to 'backseat mod' here, but please place code tags around any code or sample input/output. It really helps not to have that kind of information mashed into a paragraph.
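To see the two building blocks of the commented script in isolation, split() and $(NF-2), it may help to run them on a single record. This is just an illustration; the function name `explain_fields` is made up for the example.

```shell
#!/bin/sh
# Illustration only: the same two awk features used in the commented script.
explain_fields() {
    awk '{
        n = split($3, a, "=")   # "conn=9978793" becomes a[1]="conn", a[2]="9978793"
        printf("fields=%d pieces=%d key=%s id=%s ip=%s\n", NF, n, a[1], a[2], $(NF-2))
    }'
}

echo '[26/Aug/2011:11:24:21 +0000] conn=9978793 op=-1 msgId=-1 - fd=559 slot=559 LDAPS connection from 10.20.13.2:30999 to 10.183.7.45' | explain_fields
# prints: fields=14 pieces=2 key=conn id=9978793 ip=10.20.13.2:30999
```

NF is the number of whitespace-separated fields in the record, so $(NF-2) always addresses the third field from the end, however many fields come before it.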