awk script (complex)


 
# 15  
Old 03-12-2012
yes and no.

8 of the 10 were >20

however

2 out of the 10 were <20 but were genuinely new

so it looks like it's printing everything off as new.

Also, comparing against the initial code in my first post on this thread, it has missed one genuinely new line. I would say this is down to the -vLOOK=50 variable.

However, having amended LOOK to 6000, it has still missed the line out.

---------- Post updated at 05:34 AM ---------- Previous update was at 04:26 AM ----------

Quote:
Originally Posted by Chubler_XL
Are the numbers you're getting on the NEW: lines bigger than 20? I'm still a bit confused about what NEW lines should be. If it's just records that only appear in the most recent file, then this might work better:

Code:
cd /path/to/cisco/logs
files=`ls ciscostats_* | sort -t_ -k2.5 -k2.3,2.4 -k2.1,2.2`
first=`echo "$files" | tail -1`
 
awk -F, -vLOOK=50 -vMATCH=20 '
  FNR==1{F++}F==1{a[$1","$2","$3]++;next}
  a[$1","$2","$3]&&F<LOOK{b[$1","$2","$3]++}
  a[$1","$2","$3]{c[$1","$2","$3]++}
  END{for(i in c)if(b[i]>MATCH)print i";\t\t"b[i];else if(c[i]==a[i])print "NEW:"i";\t\t"c[i]}' $first $files



Interestingly, the "missing" new line appears with your new code, but now it is the only line that appears.

By a new line, I mean a line that fails this test from my original script:

if(b[i]-1&&a[i]!=b[i])

if(b[i]-1)
## true when the count stored in b[i] is anything other than 1,
## so the record has been seen more than once
a[i]!=b[i]
## true when the count in b[i] differs from the count in a[i],
## so the record must also appear in the other files
## if both parts are true it is an OLD line
## otherwise it is a NEW line
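To make the test above concrete, here is a minimal sketch that evaluates it for a few count pairs. The pairs are made up for illustration (they are not from the real logs), with a standing for the count in the current file and b for the count across the files being compared.

```shell
# Evaluate the OLD/NEW test for hypothetical (a:b) count pairs.
verdicts=$(awk 'BEGIN {
  n = split("1:1 2:2 1:4 1:2", pairs, " ")
  for (k = 1; k <= n; k++) {
    split(pairs[k], p, ":")
    a = p[1]; b = p[2]
    # Same condition as in the script: b is not 1, and a differs from b
    print "a=" a " b=" b " -> " ((b - 1 && a != b) ? "OLD" : "NEW")
  }
}')
printf '%s\n' "$verdicts"
```

A pair such as a=2, b=2 still comes out as NEW, because the record appears twice but only in the current file.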

Last edited by slashbash; 03-12-2012 at 01:31 AM..
# 16  
Old 03-12-2012
OK think I have it now:

Code:
cd /path/to/cisco/logs
files=`ls ciscostats_* | sort -t_ -k2.5r -k2.3,2.4r -k2.1,2.2r`
awk -F, -vLOOK=50 -vMATCH=20 '
   FNR==1{F++}F==1{a[$1","$2","$3]++;next}
   {i=$1","$2","$3;if(!(i in a))next}
   F<=LOOK{b[i]++}
   {c[$1","$2","$3]++}
   END{for(i in a)if(b[i]>0&&a[i]+b[i]>=MATCH){print i";\t\t"a[i]+b[i]}else if(c[i]+0==0)print "NEW:"i";\t\t"a[i]}' $files

# 17  
Old 03-12-2012
No.

I am getting all new lines printed with a count of 1, but some of these could occur more than once (2+) and still be new. Also, the code does not compare against all records before printing new lines, only against the LOOK value of 50 (I know this is variable, but could we not incorporate this check?).

The first script has it to a tee, i.e. it compares the current file against everything and then prints the new lines correctly. The only problem is that I also need it to check the current file against 3 months' worth of files and then print anything >20.
# 18  
Old 03-12-2012
Perhaps I'm misunderstanding your requirement.

I used LOOK=2 and MATCH=3 for these files:
Code:
*** ciscostats_08032012 ***
B,2,1
C,1,1
D,5,5
*** ciscostats_09032012 ***
B,2,1
B,2,1
*** ciscostats_10032012 ***
A,1,1
A,1,1
B,2,1
D,5,5

and this is the output I get/expect:
Code:
B,2,1;          3
NEW:A,1,1;              2

If this is wrong, perhaps you could supply a sample file set with low MATCH/LOOK counts that demonstrates what you want.
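For anyone who wants to repeat this test, the following is a rough sketch that builds the three sample files in a scratch directory and runs the post #16 script on them with LOOK=2 and MATCH=3. It assumes a POSIX shell with mktemp, uses awk with the spaced -v form in place of -vLOOK, and pipes the result through sort only because the order of for(i in a) is not guaranteed.

```shell
# Scratch directory holding the three sample files shown above.
dir=$(mktemp -d)
printf 'B,2,1\nC,1,1\nD,5,5\n'        > "$dir/ciscostats_08032012"
printf 'B,2,1\nB,2,1\n'               > "$dir/ciscostats_09032012"
printf 'A,1,1\nA,1,1\nB,2,1\nD,5,5\n' > "$dir/ciscostats_10032012"

out16=$(
  cd "$dir" || exit 1
  # Most recent file first (names are ciscostats_DDMMYYYY).
  files=$(ls ciscostats_* | sort -t_ -k2.5r -k2.3,2.4r -k2.1,2.2r)
  awk -F, -v LOOK=2 -v MATCH=3 '
    FNR==1{F++}F==1{a[$1","$2","$3]++;next}
    {i=$1","$2","$3;if(!(i in a))next}
    F<=LOOK{b[i]++}
    {c[$1","$2","$3]++}
    END{for(i in a)if(b[i]>0&&a[i]+b[i]>=MATCH){print i";\t\t"a[i]+b[i]}else if(c[i]+0==0)print "NEW:"i";\t\t"a[i]}' $files |
    LC_ALL=C sort
)
printf '%s\n' "$out16"
rm -rf "$dir"
```

D,5,5 prints nothing here: it is not repeated within the first LOOK files, and it is not new either because it appears in ciscostats_08032012.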
# 19  
Old 03-12-2012
Code:
nawk -F, 'NR==FNR{a[$1OFS$2OFS$3]++;next} a[$1OFS$2OFS$3]{b[$1OFS$2OFS$3]++}
END{for(i in b){if(b[i]-1&&a[i]!=b[i]){print i";\t\t"b[i]}else{print "NEW:"i";\t\t"b[i]} } }' OFS=, ciscostats_10032012 *.csv | sort -r

The code above uses NR==FNR to load every line of the current file (the first file argument) into array a.

Lines from the files being compared that match a key in a are counted in array b, so repeated old lines increment the count for that key.

It also prints any new lines: each key is checked against the count in array b, and where no match is found in the other files the line must be new.

I think we can modify both of these scripts to serve the purpose. My only question is whether we can run the scripts together, which is what I want.

I.e. the script above can be modified to produce only the new lines. We can remove some of the unnecessary bits, such as printing the repeat counts from array b, but we probably still need to keep this array for the new-line comparison with array a, if you understand the logic.

We can use your script with the LOOK and MATCH variables to compare the last 3 months of records and print anything >20.

I think this is possible.

Last edited by slashbash; 03-12-2012 at 08:14 PM..
# 20  
Old 03-12-2012
The code you supplied produces the following output for my 3 test files:
Code:
D,5,5;          2
B,2,1;          4
NEW:A,1,1;              2

Simplifying your requirement (1.) to "only look at the first 2 files" (i.e. a LOOK value of 2) changes the output to:
Code:
NEW:D,5,5;              1
B,2,1;          3
NEW:A,1,1;              2

Requirement (2.) that NEW should check all available files (i.e. ciscostats_08032012 is checked as well) will produce:

Code:
B,2,1;          3
NEW:A,1,1;              2

This is because "D,5,5" is in ciscostats_08032012, so it's not new.

This output matches the output of the script I supplied in post #16. You have said that #16 is wrong, but I still can't see what it's doing that you don't like.
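The first two outputs quoted in this post can be reproduced with a sketch along these lines. Several things are assumptions: awk stands in for nawk, an explicit file list replaces the *.csv glob, the current file is passed a second time so that it is counted in b[] as the quoted numbers imply, and the wrapper name classify is made up. The result is passed through LC_ALL=C sort, so the line order differs from the listings above.

```shell
dir=$(mktemp -d)
printf 'B,2,1\nC,1,1\nD,5,5\n'        > "$dir/ciscostats_08032012"
printf 'B,2,1\nB,2,1\n'               > "$dir/ciscostats_09032012"
printf 'A,1,1\nA,1,1\nB,2,1\nD,5,5\n' > "$dir/ciscostats_10032012"

# classify CURRENT FILE...  (hypothetical wrapper around the post #19 one-liner)
classify() {
  awk -F, 'NR==FNR{a[$1OFS$2OFS$3]++;next} a[$1OFS$2OFS$3]{b[$1OFS$2OFS$3]++}
  END{for(i in b){if(b[i]-1&&a[i]!=b[i]){print i";\t\t"b[i]}else{print "NEW:"i";\t\t"b[i]} } }' OFS=, "$@" |
    LC_ALL=C sort
}

cur=$dir/ciscostats_10032012
# Compare against all three files, then against the newest two only.
all_files=$(classify "$cur" "$cur" "$dir/ciscostats_09032012" "$dir/ciscostats_08032012")
two_files=$(classify "$cur" "$cur" "$dir/ciscostats_09032012")
printf 'all three files:\n%s\n\nnewest two files:\n%s\n' "$all_files" "$two_files"
rm -rf "$dir"
```

With only two files, D,5,5 is reported as NEW because the file that contains the older copy is never read.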

---------- Post updated at 11:50 AM ---------- Previous update was at 09:27 AM ----------

Looking back over this thread, I suspect you are reading the code I have supplied and deciding it doesn't do what you want, rather than trying it out with actual data, so it's probably time for me to explain what it does:

$files is populated with a list of data files with the most recent first, e.g.:
ciscostats_02012012
ciscostats_01012012
ciscostats_31122011
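The ordering of that list can be checked on its own with a small sketch. It feeds the three example names through the sort keys from post #16: year (characters 5 onward of the DDMMYYYY field), then month, then day, each reversed. The function name newest_first is made up for illustration.

```shell
# Sort ciscostats_DDMMYYYY names so the most recent date comes first.
newest_first() {
  LC_ALL=C sort -t_ -k2.5r -k2.3,2.4r -k2.1,2.2r
}

sorted=$(printf '%s\n' ciscostats_31122011 ciscostats_02012012 ciscostats_01012012 | newest_first)
printf '%s\n' "$sorted"
```

A plain alphabetical sort would not work here, since the day comes first in the file name.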

a[] contains a count of how many times each ID appears in the first (most recent) file.

b[] contains a count of how many times an ID from a[] appears in files 2 through LOOK.

c[] contains a count of how many times an ID from a[] appears in any file other than the first.

At the end we print any ID that appears in both a[] and b[] and has an a[]+b[] count >= MATCH. Otherwise, a "NEW" record is output if the ID appears in a[] and not in c[].
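The same logic, spread out with one comment per rule, might look like the sketch below. The file names new, prev and old and their contents are invented for illustration, and the files are passed newest first by hand instead of through $files.

```shell
dir=$(mktemp -d)
# Hypothetical data, newest file first on the command line.
printf 'X,1,1\nX,1,1\nY,1,1\nZ,1,1\n' > "$dir/new"
printf 'X,1,1\n'                      > "$dir/prev"
printf 'Y,1,1\n'                      > "$dir/old"

report=$(awk -F, -v LOOK=2 -v MATCH=3 '
  FNR == 1 { F++ }                     # F = position of the current file on the command line
  { key = $1 "," $2 "," $3 }
  F == 1 { a[key]++; next }            # a[]: occurrences in the most recent file
  !(key in a) { next }                 # ignore IDs that are not in the most recent file
  F <= LOOK { b[key]++ }               # b[]: occurrences in files 2..LOOK
  { c[key]++ }                         # c[]: occurrences in any file after the first
  END {
    for (key in a)
      if (b[key] > 0 && a[key] + b[key] >= MATCH)
        print key ";\t\t" a[key] + b[key]
      else if (c[key] + 0 == 0)
        print "NEW:" key ";\t\t" a[key]
  }' "$dir/new" "$dir/prev" "$dir/old" | LC_ALL=C sort)
printf '%s\n' "$report"
rm -rf "$dir"
```

In this data X is repeated within LOOK and reaches MATCH, Z is new, and Y prints nothing because its only older copy sits in the third file, outside LOOK but still counted in c[].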
# 21  
Old 03-12-2012
Code:
[V490]#files=`ls ciscostats* | sort -t_ -k2.5r -k2.3,2.4r -k2.1,2.2r`
[V490]#nawk -F, -vLOOK=60 -vMATCH=20 '
>    FNR==1{F++}F==1{a[$1","$2","$3]++;next}
>    {i=$1","$2","$3;if(!(i in a))next}
>    F<=LOOK{b[i]++}
>    {c[$1","$2","$3]++}
>    END{for(i in a)if(b[i]>0&&a[i]+b[i]>=MATCH){print i";\t\t"a[i]+b[i]}else if(c[i]+0==0)print "NEW:"i";\t\t"a[i]}' $files
NEW:NREE_CISCO3750,10,2          1

I know for sure there has been more than 1 new line. I have even compared back with LOOK set to 600.