awk to filter multiple lines


 
# 1  
Old 10-23-2013

Hi.
I need to filter lines based on matches in multiple tab-separated columns. For all lines that share the same value in column 1, check the corresponding column 4. If all column 4 entries are identical, discard all of those lines. If even one entry in column 4 differs, keep all of them.

How can I modify the following awk script to compare the 4th column and not the 2nd column:

Code:
FNR==NR {
    array[$0]++
    next
}

{
    counter = 0
    for (i in array) {
        split(i, holder, FS)
        if (holder[1] == $4) {
            counter++
        }
    }
    if (counter >= 2) {
        print
    }
}

Code:
  $ awk -f script.awk file.txt{,}

The input data is the following:

Code:
DOG A B BIG 
DOG C D BIG 
DOG E F BIG 
CAT G H SMALL 
CAT I J SMALL 
CAT K L BIG 
CAT M N SMALL

The desired output is the following:

Code:
CAT G H SMALL
CAT I J SMALL 
CAT K L BIG 
CAT M N SMALL


Last edited by owwow14; 10-23-2013 at 08:50 AM.. Reason: improved formatting
# 2  
Old 10-23-2013
Try

Code:
$ awk 'NR==FNR{if(A[$1]!=$NF && A[$1]){B[$1]++}A[$1]=$NF;next}{if(B[$1]){print }}' file file

CAT G H SMALL
CAT I J SMALL
CAT K L BIG
CAT M N SMALL
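
For readers who want the logic spelled out, here is a commented variant of the two-pass idea, offered as a sketch rather than as the one-liner itself. It compares each column 4 value against the first one seen for that key (the one-liner compares against the previous one) and tests membership with `in` instead of truthiness. The file name `file.txt` and the array names `first` and `mixed` are assumptions made for this illustration.

```shell
# Sample data from post #1, saved under a hypothetical file name.
printf '%s\n' \
  'DOG A B BIG' 'DOG C D BIG' 'DOG E F BIG' \
  'CAT G H SMALL' 'CAT I J SMALL' 'CAT K L BIG' 'CAT M N SMALL' > file.txt

out=$(awk '
# Pass 1: NR == FNR is true only while the first file argument is read.
# Remember the first column 4 value per column 1 key, and flag the key
# as soon as a different column 4 value appears.
NR == FNR {
    if (!($1 in first)) {
        first[$1] = $4
    } else if (first[$1] != $4) {
        mixed[$1] = 1
    }
    next
}
# Pass 2: print every line whose key was flagged in pass 1.
$1 in mixed
' file.txt file.txt)

printf '%s\n' "$out"
```

With the default field separator, awk splits on any run of whitespace. If the real data is strictly tab-separated and fields may contain spaces, adding `-F'\t'` is probably what you want.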

# 3  
Old 10-23-2013
Hi Pamu,
I have been trying your suggestion and it does not work.
It does not output anything.

One question: why do you have "file" "file"?
Shouldn't it be
Code:
"file_input" > "file_output"

I ask because I am only considering one file, and perhaps this is the reason for the error?
# 4  
Old 10-23-2013
No, I have given the same file as input two times. That is why it is file file and not file > file.

If you want to redirect the output to another file, do it like this: file file > file_out
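
To see what passing the same file twice does, here is a small sketch. The file name `demo.txt` and the variable `pass` are made up for the illustration. `NR` counts records across all inputs while `FNR` restarts for each file argument, so `NR == FNR` holds only during the first read. In bash, the `file.txt{,}` form from post #1 expands to the same two arguments.

```shell
# Two-line demo file (hypothetical name).
printf '%s\n' one two > demo.txt

out=$(awk '{
    # NR keeps counting across files; FNR restarts at 1 for each file.
    pass = (NR == FNR) ? 1 : 2
    print "pass " pass, FNR, $0
}' demo.txt demo.txt)

printf '%s\n' "$out"
```

Each line of `demo.txt` shows up twice, once tagged with pass 1 and once with pass 2. Redirecting with `> file_out` after the two file names only changes where that output goes.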
# 5  
Old 10-23-2013
Thanks pamu,
I had not understood that the file was read twice as input.
Works great!
# 6  
Old 10-23-2013
Try also:
Code:
awk '
# Single pass; assumes lines sharing column 1 are adjacent in the input.
!a[$1]++ { if (p && s) printf "%s", s; p=0; s="" }
{ if (!a[$1,$4]++) p=1; if (!s) p=0; s=s $0 "\n" }
END { if (p && s) printf "%s", s }
' input
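
The one-pass version above buffers one group at a time, so it relies on lines with the same column 1 being adjacent. If that cannot be guaranteed, one possible alternative (a sketch, not taken from the thread) is to hold every line in memory and decide in the `END` block. The array names `line`, `key`, `first` and `mixed` are assumptions for this example, and the sample input is deliberately interleaved.

```shell
out=$(printf '%s\n' \
  'DOG A B BIG' 'CAT G H SMALL' 'DOG C D BIG' 'CAT K L BIG' |
awk '
{
    # Keep every line and its key so input order can be restored later.
    line[NR] = $0
    key[NR]  = $1
    if (!($1 in first)) {
        first[$1] = $4
    } else if (first[$1] != $4) {
        mixed[$1] = 1          # this key has at least two column 4 values
    }
}
END {
    # Print, in original order, the lines whose key was flagged.
    for (i = 1; i <= NR; i++)
        if (key[i] in mixed) print line[i]
}')

printf '%s\n' "$out"
```

This costs memory proportional to the file size, so for very large inputs the two-pass form is likely the safer choice.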

