awk: matching and not matching


 
# 1  
11-11-2011

Hello all,

I have a simple match / no-match problem that I can't figure out.
Code:
file1
hostname:
30 10 * * * /home/toto/start  PROD instance_name1 -p
00 9 * * * /home/toto/start  PROD instance_name2 -p
15 8 * * * /home/toto/start  PROD instance_name3 -p

hostname2:
00 8 * * * /home/toto/start  PROD instance_name4 -p
10 8 * * * /home/toto/start  PROD instance_name5 -p

hostname3:
45 3 * * * /home/toto/start PROD instance_name7 -p

hostname4:
33 0 * * * /home/toto/start PROD instance_name6

file2
backuphostname:
30 10 * * * /home/toto/start  PROD instance_name1 -b
00 9 * * * /home/toto/start  PROD instance_name2 -b
15 8 * * * /home/toto/start  PROD instance_name3 -b

backuphostname2:
30 15 * * * /home/toto/start  PROD instance_name4 -b
00 10 * * * /home/toto/start  PROD instance_name5 -b

The goal is to match on the instance name so I can tell which host is the backup of which host for each instance. That part seems to work OK with my code.

What I can't figure out, though, is the exception: instance_name7 doesn't have a backup in file2. This is the output I want:

Code:
bash$ awk -f list.awk file1 file2
PROD: hostname2 instance_name4 -p BACKUP: backuphostname2 instance_name4 -b
PROD: hostname3 instance_name7 NO BACKUP

My code now:

Code:
BEGIN {
    FS="\n"
    RS=""
}
NR == FNR {
    for (i = 2; i <= NF; i++)
        split($i,prodsle," ")
            prod[$1]=prodsle[8]
            next
}
{
    for (i = 2; i <= NF; i++)
        split($i,backupsle," ")
            backup[$1]=backupsle[8]
            for ( x in prod )
                if ( backupsle[8] == prod[x] ) 
                    printf "%s %s %2s %s %s %s \n",x,prod[x],prodsle[9],$1,backupsle[8],backupsle[9]
}

END {
    print "Done"
}

# 2  
11-11-2011
I don't understand the requirement ...
I suppose it will be easier if you post an example of the desired output and
explain how it differs from the one you're getting.
# 3  
11-11-2011
Right now there is a print only if there is a match. Unmatched items are not output, and the way the loop has been written, if I just add an else I end up printing all unmatched hosts (not the desired behavior). That's the part I'm stumbling on.

Code:
bash$ awk -f list.awk file1 file2
PROD: hostname2 instance_name4 -p BACKUP: backuphostname2 instance_name4 -b
PROD: hostname3 instance_name7 NO BACKUP

This is what I want. The first line is an instance matched with its backup, and the second line is an unmatched instance_name.

Right now I get only lines like the first one (for all matching instances), not the unmatched ones.
# 4  
11-11-2011
I think you are making things much more complicated than they need to be. Here's an example that prints instance name, production host, and backup host and also indicates if there is no backup host.

Code:
awk '
    NF < 2 { host = $1; next; } # snag host name from either file

    NR == FNR {             # capture host that each prod runs on
        prod[$8] = host;
        next;
    }

    {                       # capture host that each backup runs on
        back[$8] = host;
        next;
    }

    END {
        for( x in prod )
            printf( "%s production on %s backed up on %s\n", x, prod[x], back[x] == "" ? "NO BACKUP HOST" : back[x] );
    }
' file1 file2

Running it on your sample data yields this:
Code:
instance_name1 production on hostname: backed up on backuphostname:
instance_name2 production on hostname: backed up on backuphostname:
instance_name3 production on hostname: backed up on backuphostname:
instance_name4 production on hostname2: backed up on backuphostname2:
instance_name5 production on hostname2: backed up on backuphostname2:
instance_name6 production on hostname4: backed up on NO BACKUP HOST
instance_name7 production on hostname3: backed up on NO BACKUP HOST

May not be exactly what you want, but should give you an idea of how you can organise your code to give you both.
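
Two details of the output above are worth noting: the host names keep the trailing colon from the input files, and awk does not guarantee the order in which for( x in prod ) visits the keys, so the lines may not come out sorted as shown. What follows is a minimal sketch of a variant, assuming it is acceptable to strip the colon and to pipe the result through sort. The function name list_backups is hypothetical and only there so the snippet can be called on any two files.

```shell
# Hypothetical wrapper: list_backups PRODFILE BACKUPFILE
list_backups() {
    awk '
        NF < 2 {                    # host line (or blank line)
            host = $1
            sub(/:$/, "", host)     # drop the trailing colon
            next
        }
        NR == FNR { prod[$8] = host; next }   # first file: production
        { back[$8] = host }                   # second file: backup

        END {
            for (x in prod)
                printf("%s production on %s backed up on %s\n",
                       x, prod[x], (x in back) ? back[x] : "NO BACKUP HOST")
        }
    ' "$1" "$2" | sort
}
```

Called as list_backups file1 file2. The (x in back) test checks for the key without creating an empty back[x] element, which a comparison such as back[x] == "" does as a side effect.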

---------- Post updated at 22:14 ---------- Previous update was at 22:02 ----------

This will list the instances organised by hostname in file1:

Code:
awk '
    NF < 2 { host = $1; next; } # snag host name from either file
    NR == FNR { inst[host] = inst[host] $8 " "; next; }
    { back[$8] = host; next; }

    END {
        for( h in inst )
        {
            printf( "host: %s\n", h );
            n = split( inst[h], a, " " );
            for( i = 1; i <= n; i++ )
                printf( "\t%s %s\n", a[i], back[a[i]] == "" ? "NOT BACKED UP" : "backed up on " back[a[i]] );
            printf( "\n" );
        }
    }
' file1 file2

Output looks like this:
Code:
host: hostname:
       instance_name1 backed up on backuphostname:
       instance_name2 backed up on backuphostname:
       instance_name3 backed up on backuphostname:

host: hostname2:
       instance_name4 backed up on backuphostname2:
       instance_name5 backed up on backuphostname2:

host: hostname3:
       instance_name7 NOT BACKED UP

host: hostname4:
       instance_name6 NOT BACKED UP

# 5  
11-11-2011
Your code doesn't seem to do what you claimed. I guess you wanted something like the following. It doesn't print exactly the format you wanted, but should be easy to adapt.

Code:
$ cat list.awk
BEGIN {
  RS = ""
  FS = "\n"
}
{
  for (i = 2; i <= NF; i++) {
    split($i, a, " ")
    if (NR == FNR)
      h[a[8]] = $1
    else 
      print a[7], $1, a[8], (a[8] in h)? h[a[8]] : "NO"
  }
}
END { print "Done" }

$ awk -f list.awk file2 file1

Didn't realize that agama had already answered, but nevertheless ...
# 6  
11-15-2011
Thank you both for the code. I took agama's code and tried to add some of my own to grab the options (the -p or -b), but I'm surely missing something.

Code:
NF < 2 { host = $1; next; } 

NR == FNR {             
        prd_option[ prod[$8] = host ] = $9;
        next;
}

{                       
        bck_option[ back[$8] = host ] = $9;
        next;
}

    END {
        for( x in prod )
            printf( "%s, %s, %s, %s, %s\n", x, prod[x], prd_option[x], back[x] == "" ? "NO BACKUP HOST" : back[x], bck_option[x] );
}

In the first block: since x in prod is instance_name1, my thought was that prd_option[instance_name1] would equal -p.
In the last printf: again, since I have two option arrays (one prod, one backup), I expected prd_option[instance_name1] or bck_option[instance_name1] to hold my -p or -b, or nothing if it's empty.

I'm surely missing something OR I got this all wrong... Thanks.
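
A likely explanation, offered as a sketch rather than an answer confirmed in the thread: in awk an assignment is itself an expression whose value is the value assigned, so prd_option[ prod[$8] = host ] = $9 stores the option under the host name, not the instance name. Looking up prd_option[x] with x set to an instance name then finds nothing. Keeping the two assignments separate and keying every array by $8 avoids this. The wrapper name list_options is hypothetical, the colon stripping and the sort are added assumptions, and the layout only approximates the PROD: ... BACKUP: ... lines requested earlier.

```shell
# Hypothetical wrapper: list_options PRODFILE BACKUPFILE
list_options() {
    awk '
        NF < 2 { host = $1; sub(/:$/, "", host); next }

        NR == FNR {                 # first file: production entries
            prod[$8] = host
            prd_option[$8] = $9     # keyed by instance name, not host
            next
        }
        {                           # second file: backup entries
            back[$8] = host
            bck_option[$8] = $9
        }

        END {
            for (x in prod) {
                line = "PROD: " prod[x] " " x
                if (prd_option[x] != "")
                    line = line " " prd_option[x]
                if (x in back) {
                    line = line " BACKUP: " back[x] " " x
                    if (bck_option[x] != "")
                        line = line " " bck_option[x]
                } else
                    line = line " NO BACKUP"
                print line
            }
        }
    ' "$1" "$2" | sort
}
```

Called as list_options file1 file2. An entry without a ninth field, such as instance_name6, simply gets an empty option, which the END block leaves out of the line.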