File Parsing


 
# 1  
Old 09-24-2007

Hi All,
I have two ASCII files with the following data:

File 1
#lport1:dc1:lport2:dc2 - All records were delimited by :
6300:ADEF12:6305:ATNE59
3411:EGFE31:3499:GDEF21
. . . .
. . . .
total of 55,000 Records

File 2
#seqno:lport1:id:dlc1:vid:lport2:nni:dc2:ci - All records delimited by :
60568:3411:98:EGFE31:965:3499:3799:GDEF21:432
. . . . . . . . .
. . . . . . . . .
total of 58,000 Records

I need to compare the lport1, dc1, lport2 and dc2 values of file1 with the lport1, dc1, lport2 and dc2 values of file2 and, where all four match, write the entire line from file2 to another file. I tried writing a Perl script under Solaris 2.5.8, which took almost 6 hours to finish.
Could anyone help me get this task to run much faster, i.e. in less than 15 minutes, using awk or a shell script?
Thanks in advance.
# 2  
Old 09-24-2007
Assuming:
Code:
File 2
#seqno:lport1:id:dlc1:vid:lport2:nni:dc2:ci – All records delimited by :

actually means:
Code:
File 2
#seqno:lport1:id:dc1:vid:lport2:nni:dc2:ci – All records delimited by :

nawk -f jsusheel.awk file1 file2

jsusheel.awk:
Code:
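BEGIN {
   FS=OFS=":"    # fields are colon-delimited on input and output
}
# file1: store lport1, dc1, lport2, dc2 as one composite key
NR==FNR { f1[$1, $2, $3, $4]; next }
# file2: print the record when fields 2, 4, 6, 8 form a key stored in f1
($2 SUBSEP $4 SUBSEP $6 SUBSEP $8) in f1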

# 3  
Old 09-24-2007

Hi Vgersh99,
Thanks for the reply. Yes, your assumption is correct: it should be dc1 instead of dlc1. Sorry for the typo.
When I executed the awk script there was no matching output. The body starting with NR==FNR works perfectly, reading all the input records from file1; I verified this using print $0.
However, I do not have any clue about the line ($2 SUBSEP $4 SUBSEP $6 SUBSEP $8) in f1. Could you please help me decipher this line, as I am not very comfortable with awk?
Also, please note that the records in file1 do not line up with the records in file2 one to one, i.e. the first record in file1 may match the 100th record in file2 and the second record in file1 may match the 40123rd record in file2.
Again, thank you for your time.
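
Nothing in the thread confirms why the script printed no matches. One possibility, stated here purely as an assumption, is invisible characters in file1: a trailing carriage return or blank ends up inside field 4 (dc2), so the stored key never equals the key built from file2. A minimal sketch that cleans each record before the lookup, using made-up sample files in a temporary directory (awk here stands for any POSIX awk, such as the nawk used above):

```shell
# Hypothetical sample data: file1 carries DOS line endings (\r\n).
tmp=$(mktemp -d)
printf '3411:EGFE31:3499:GDEF21\r\n' > "$tmp/file1"
printf '60568:3411:98:EGFE31:965:3499:3799:GDEF21:432\n' > "$tmp/file2"

# Strip a trailing carriage return and trailing blanks from every record,
# then apply the same composite-key lookup as jsusheel.awk.
cleaned=$(awk '
BEGIN { FS = OFS = ":" }
{ sub(/\r$/, ""); sub(/[ \t]+$/, "") }    # editing $0 re-splits the fields
NR == FNR { f1[$1, $2, $3, $4]; next }    # file1: remember the key
($2 SUBSEP $4 SUBSEP $6 SUBSEP $8) in f1  # file2: print on a key hit
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$cleaned"
rm -rf "$tmp"
```

Without the two sub() calls, the key stored from this sample file1 ends in a carriage return and the file2 record is not printed.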
# 4  
Old 09-25-2007
An easy but not very efficient one

Hi,
I have an idea for your requirements, but it may be very slow when the files contain too many records.
Just for your reference.

Input:
Code:
first.txt:
1:a:2:b
3:c:4:d
5:e:6:f
7:g:8:h

second.txt:
60568:1:98:a:965:2:3799:b:432
60568:1:98:f:965:2:3799:b:432
60568:3:98:c:965:4:3799:d:432
60568:3:98:c:965:4:3799:w:432
60568:5:98:e:965:6:3799:f:432

Output:
Code:
60568:1:98:a:965:2:3799:b:432
60568:3:98:c:965:4:3799:d:432
60568:5:98:e:965:6:3799:f:432

Code:
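awk 'BEGIN{FS=":"}
{
    if (NF<=4)
        pre[NR]=$0              # first.txt: keep each 4-field line as is
    else
    {
        # second.txt: rebuild fields 2,4,6,8 in the first.txt layout
        a=sprintf("%s:%s:%s:%s",$2,$4,$6,$8)
        for (i in pre)          # linear scan over every stored line
            if (pre[i]==a)
                print $0
    }
}' first.txt second.txt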
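
The inner loop above compares every second.txt record against every stored first.txt line, so at the sizes in the first post (55,000 and 58,000 records) it can reach about 3.19 billion string comparisons. A sketch of the same matching with the first.txt lines used as array keys, so each second.txt record costs one lookup; the temporary directory and the array name seen are illustrative choices, not part of the original post:

```shell
# Recreate the sample input from this post in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 1:a:2:b 3:c:4:d 5:e:6:f 7:g:8:h > "$tmp/first.txt"
printf '%s\n' \
  60568:1:98:a:965:2:3799:b:432 \
  60568:1:98:f:965:2:3799:b:432 \
  60568:3:98:c:965:4:3799:d:432 \
  60568:3:98:c:965:4:3799:w:432 \
  60568:5:98:e:965:6:3799:f:432 > "$tmp/second.txt"

matched=$(awk -F: '
NR == FNR { seen[$0]; next }        # first.txt: the whole line is the key
($2 ":" $4 ":" $6 ":" $8) in seen   # second.txt: one lookup per record
' "$tmp/first.txt" "$tmp/second.txt")

printf '%s\n' "$matched"
rm -rf "$tmp"
```

On this sample it prints the same three lines as the output listed above.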

# 5  
Old 09-25-2007
f1:
Code:
6300:ADEF12:6305:ATNE59
3411:EGFE31:3499:GDEF21

f2:
Code:
60568:3411:98:EGFE31:965:3499:3799:GDEF21:432
60568:3422:98:EGFE31:965:3499:3799:GDEF21:432

produces:
Code:
60568:3411:98:EGFE31:965:3499:3799:GDEF21:432

Looks good to me, given your original description of the fields and the matching criteria.

The '($2 SUBSEP $4 SUBSEP $6 SUBSEP $8)' expression is the matching key for file2: fields 2, 4, 6 and 8 of each file2 record, concatenated with SUBSEP between them, form a key that is looked up in the associative array 'f1'.
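
To make the key construction concrete, here is a small standalone sketch. In awk, a comma-separated subscript such as f1[a, b] is stored under the single string a SUBSEP b, so a key assembled by hand with SUBSEP finds an element created with commas. The values are taken from the sample records above; the variable names key and miss are illustrative:

```shell
demo=$(awk 'BEGIN {
    # Comma subscripts are joined with SUBSEP into a single string key.
    f1["3411", "EGFE31", "3499", "GDEF21"]

    # Key built by hand from the first f2 record (fields 2, 4, 6, 8).
    key = "3411" SUBSEP "EGFE31" SUBSEP "3499" SUBSEP "GDEF21"
    print ((key in f1) ? "match" : "no match")

    # Key from the second f2 record: lport1 is 3422, so no element exists.
    miss = "3422" SUBSEP "EGFE31" SUBSEP "3499" SUBSEP "GDEF21"
    print ((miss in f1) ? "match" : "no match")
}')
printf '%s\n' "$demo"
```

The first lookup reports a match and the second does not, which is why only the 3411 record is printed from f2.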
# 6  
Old 09-25-2007

Hi,
Many thanks to Summer_cherry and vgersh99 for the responses.
These scripts still use a lot of CPU and take a long time to complete, so I have decided to run them at midnight.
Thanks a lot.