awk comparison with two patterns


 
# 1  
Old 04-03-2013
awk comparison with two patterns

Here is my list, which contains URLs for file downloads:

Code:
//servername.com/version/panasonic1,1_1.1.1
//servername.com/version/panasonic3,1_6.7.1
//servername.com/version/panasonic3,2_6.8
//servername.com/version/panasonic2,6_3.0.2
//servername.com/version/panasonic3,1_7.1.3
//servername.com/version/panasonic2,6_3.0.4

This list was acquired using curl and saved to a text file. The file is used as input for wget, and the files are downloaded into a specified directory. I have worked out the curl and wget syntax and have working lines of code.

However, I don't need to download all of the files listed in the text file, as some are older versions of the software for a particular model of hardware.

In the list above, models are defined by model,revision (x,x), and the software versions are in the form major.minor.revision (x.x.x) or sometimes major.minor (x.x).

From my reading of the boards and my limited knowledge of awk, I think I need to isolate the (model,revision) patterns

Code:
 awk 'BEGIN{FS="/p"; OFS="_"}

and use an if statement to print those which are unique. Where other lines match the same model, I need to compare them by version number, which means I'll need to redefine FS and OFS to isolate the second pattern.

Code:
{FS=OFS="_"}

The comparison will then only print lines that have the latest software version. From the example lines above that would be:

Code:
//servername.com/version/panasonic1,1_1.1.1
//servername.com/version/panasonic3,2_6.8
//servername.com/version/panasonic3,1_7.1.3
//servername.com/version/panasonic2,6_3.0.4

I think awk is up to the task but my knowledge of it is not. Any ideas would be much appreciated!
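
For what it's worth, redefining FS part-way through is probably not needed: awk's split() can pull both patterns out of the same line in one pass. A minimal sketch, assuming the file name is always the last path component and that the model name contains no digits before the model number; split_fields is a hypothetical helper name, not something from this thread:

```shell
# Hypothetical helper: print "model,revision<TAB>version" for each URL on stdin.
split_fields() {
  awk '{
    n = split($0, path, "/")      # path[n] is the last path component
    split(path[n], part, "_")     # part[1] = "panasonic3,1", part[2] = "6.7.1"
    model = part[1]
    sub(/^[^0-9]*/, "", model)    # drop the leading name, keep "3,1"
    print model "\t" part[2]
  }'
}

# Example call on one line from the list above.
printf '%s\n' '//servername.com/version/panasonic3,1_6.7.1' | split_fields
```

The two values could then be compared inside the same awk program instead of being printed.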

Last edited by Scrutinizer; 04-06-2013 at 05:24 PM.. Reason: Changed all icode tags to code tags
# 2  
Old 04-03-2013
Code:
$ awk -F"_" -v OFS="_" ' { if(!arr[$1]) { arr[$1]=$2} else if (arr[$1]<$2) { arr[$1]=$2 } } END { for(i in arr) { print i , arr[i] } } ' file
//servername.com/version/panasonic2,6_3.0.4
//servername.com/version/panasonic3,1_7.1.3
//servername.com/version/panasonic3,2_6.8
//servername.com/version/panasonic1,1_1.1.1

# 3  
Old 04-03-2013
Quote:
Originally Posted by anbu23
Code:
$ awk -F"_" -v OFS="_" ' { if(!arr[$1]) { arr[$1]=$2} else if (arr[$1]<$2) { arr[$1]=$2 } } END { for(i in arr) { print i , arr[i] } } ' file
//servername.com/version/panasonic2,6_3.0.4
//servername.com/version/panasonic3,1_7.1.3
//servername.com/version/panasonic3,2_6.8
//servername.com/version/panasonic1,1_1.1.1

Perfect!

Wow, that was quick. Any chance you can explain a little of what you have written? I'm keen to understand rather than just copy and paste code.

Thanks anbu23

---------- Post updated at 12:43 PM ---------- Previous update was at 12:32 PM ----------

In particular I want to make sure that if the folder structure is different:

Code:
//servername.com/version/panasonic3,1_6.7.1
//servername.com/version/panasonic3,2_6.8
//servername.com/version/panasonic2,6_3.0.2
//servername.com/newversion/panasonic3,1_7.1.3

the if/else logic that you have skilfully written will still apply.

Last edited by Scrutinizer; 04-06-2013 at 05:24 PM.. Reason: icode to code
# 4  
Old 04-04-2013
if(!arr[$1]) { arr[$1]=$2} else if (arr[$1]<$2) { arr[$1]=$2 }
If the first field is not yet present in the array arr, assign the second field to arr with the first field as the index.
Otherwise, if the first field is already present, check whether the value stored in arr is less than the second field read from the file; if so, assign the second field to arr.

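The code will still run if the folder structure is different, but note that the folder path is part of the first field, so the same model under two different folders is treated as two separate entries.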
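
The same logic, spread over several lines with comments, might be easier to follow. This is only a sketch: latest_by_prefix is a hypothetical wrapper name, and it uses awk's `in` operator for the "not present" test, which is my assumption about the intent of !arr[$1].

```shell
# Hypothetical wrapper around the one-liner; reads the URL list on stdin.
latest_by_prefix() {
  awk -F_ -v OFS=_ '
    {
      # $1 is everything before the first "_", $2 is the version string.
      if (!($1 in best) || best[$1] < $2)
        best[$1] = $2        # first sighting, or a later version
    }
    END {
      for (key in best)      # traversal order is unspecified in awk
        print key, best[key]
    }'
}
```

It could be run as latest_by_prefix < file. Be aware that versions such as 7.1.3 are compared as strings here, so a version like 10.0.1 would sort before 9.0.1.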
# 5  
Old 04-05-2013
Thanks for explaining that, makes more sense now I see it talked through! I have been running this for a couple of days to check the results and noticed that I have oversimplified the path names when I originally phrased the question. The result is that full path names are not printed and likewise all versions of certain models are being printed.

*hangs head in shame*

That's clearly my fault for incorrectly phrasing the question. For the task as described, the command worked as required.

Here are the differences with filenames:

1. The end of the filename is _Software.bak (This never changes)

//servername.com/version/panasonic2,6_3.0.4 (What I originally posted)
//servername.com/panasonic3/version/panasonic2,6_3.0.4_Software.bak (What the file path actually is)

2. Some of the folders contain part of the software version in a higher folder (panasonic3 or pan6 for example). This means a comparison based on anything before the first _ will not show these two results as unique when in actual fact one is to be printed and the other discarded.

Code:
//servername.com/pan7/version/panasonic3,1_7.1.3_Software.bak
//servername.com/pan6/version/panasonic3,1_6.7.1_Software.bak

I realise I have now rephrased the question. This will make things a bit more difficult, but awk seems a powerful tool. I'll start working on it at my end.

A full file list might look like this:

Code:
//servername.com/pan7/version/panasonic3,1_7.1.3_Software.bak
//servername.com/pan6/version/panasonic3,2_6.8_Software.bak
//servername.com/panasonic1/version/panasonic1,1_1.1.1_Software.bak
//servername.com/pan6/version/panasonic3,1_6.7.1_Software.bak
//servername.com/panasonic3/version/panasonic2,6_3.0.2_Software.bak
//servername.com/pan6/version/panasonic3,2_6.2.3_Software.bak
//servername.com/panasonic3/version/panasonic2,6_3.0.4_Software.bak

And would sort to:

Code:
//servername.com/pan7/version/panasonic3,1_7.1.3_Software.bak
//servername.com/pan6/version/panasonic3,2_6.8_Software.bak
//servername.com/panasonic1/version/panasonic1,1_1.1.1_Software.bak
//servername.com/panasonic3/version/panasonic2,6_3.0.4_Software.bak


Last edited by Scrutinizer; 04-06-2013 at 05:26 PM.. Reason: code tags
# 6  
Old 04-06-2013
Quick and dirty:
Code:
sort -r -t/ -k5 file | awk -F_ '{revision=substr($1,length($1)-2,3)} arr[revision]++==0 {print}'

This sorts in reverse order from field 5 onward; the delimiter is /
I.e. */*/*/*/field5/field6
Different directory depths would give wrong results.
Then awk prints the first occurrence of each revision.
The revision is assumed to be the last 3 characters of field 1; delimiter is _.
The revisions and versions must be x,x and x.x.x; e.g. an xx.x.x would give wrong results.

---------- Post updated at 03:26 PM ---------- Previous update was at 03:08 PM ----------

And a variant that is safe against different directory depths:
Code:
awk -F_ '
{revision=substr($1,length($1)-2,3)}
$2>ar2[revision] {ar2[revision]=$2; ar1[revision]=$0}
END {for(revision in ar1) print ar1[revision]}
' file
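
If multi-digit models or versions (e.g. xx.x.x) are a concern, the two assumptions above could be relaxed along these lines. This is a sketch only, not tested against the real server listing: latest_per_model and newer are hypothetical names, the key is assumed to be everything in the file name before the first _, and the version is assumed to be the part between the first and second _.

```shell
# Hypothetical helper: keep the line with the highest version per model,revision.
latest_per_model() {
  awk '
    # Return 1 if dotted version a is newer than b, comparing part by part as numbers.
    function newer(a, b,    x, y, n, m, i) {
      n = split(a, x, "[.]"); m = split(b, y, "[.]")
      for (i = 1; i <= n || i <= m; i++)
        if ((x[i] + 0) != (y[i] + 0)) return (x[i] + 0) > (y[i] + 0)
      return 0
    }
    {
      n = split($0, path, "/")     # path[n] is the file name, at any directory depth
      split(path[n], part, "_")    # part[1] = model,revision  part[2] = version
      key = part[1]
      if (!(key in ver) || newer(part[2], ver[key])) {
        ver[key] = part[2]
        line[key] = $0             # remember the whole line for printing
      }
    }
    END { for (key in line) print line[key] }'
}
```

It could be run as latest_per_model < file. A missing component counts as 0, so 6.8 is treated like 6.8.0; the output order is whatever awk's for-in loop yields.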


Last edited by MadeInGermany; 04-06-2013 at 05:14 PM..