awk to grep rows by multiple fields


 
# 1  
Old 05-01-2012

Hello,
I ran into a challenge extracting part of a table. I'd like to grep the first match for each combination of field1 and field2 (three rows in this sample). Input:
Code:
D A 92.85   1315    83      11     
D A 95.90   757     28      3      
D A 94.38   480     20      7      
D A 91.21   307     21      6      
D A 94.26   244     14      0      
D A 93.66   142     9       0      
D B 91.82   1321    92      16     
D B 94.85   757     30      4      
D B 94.17   480     22      6      
D B 90.79   304     26      2      
D B 93.39   242     16      0      
D B 90.97   144     11      2      
D C 89.86   1321    119     15

Output
Code:
D A 92.85   1315    83      11     
D B 91.82   1321    92      16     
D C 89.86   1321    119     15

A similar question about picking up the first match was posted before, but this one is more of a brain-twister, and I feel there must be a simple script to do the job. Thanks a lot!
# 2  
Old 05-01-2012
Standard awk unique program
Code:
$ awk '!a[$1,$2]++' input
D A 92.85   1315    83      11
D B 91.82   1321    92      16
D C 89.86   1321    119     15

If you also want to limit the output to just 3 matches: awk '!a[$1,$2]++&&++m;m==3{exit}' input
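
For readability, the limited version can also be spelled out with named variables. This is only a sketch of the same idea: the names seen, n and max are arbitrary choices, and the sample rows are made up for illustration.

```shell
# Hypothetical sample: four distinct (field1, field2) pairs, some repeated.
sample='D A 92.85
D A 95.90
D B 91.82
D C 89.86
D B 94.85
D E 90.00'

# seen[] remembers key pairs already printed, n counts printed rows,
# and max (passed with -v) is the cut-off after which awk stops reading.
first_three=$(printf '%s\n' "$sample" |
  awk -v max=3 '!seen[$1,$2]++ { print; if (++n == max) exit }')

printf '%s\n' "$first_three"
```

With this sample it prints the D A 92.85, D B 91.82 and D C 89.86 rows and exits before reaching D E.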
# 3  
Old 05-01-2012
In two steps, though I suppose it could be simplified:

Code:
$ cut -d" " -f1,2 <sample6.txt | sort -u > sample6a.txt

$ while read -r line; do grep "^$line " sample6.txt | head -n 1; done <sample6a.txt
D A 92.85   1315    83      11
D B 91.82   1321    92      16
D C 89.86   1321    119     15

The first command collects the unique field1/field2 pairs.
The second grabs the first line matching each pair.
# 4  
Old 05-01-2012
Or:
Code:
nl infile | sort -k2,3 -k1 | sort -u -k2,3 | cut -f2- | head -n3

GNU sort:
Code:
sort -u -k1,2 infile | head -n3

# 5  
Old 05-01-2012
Thanks Scott! I tried your first way myself, but thought it was wrong. Your second way is what I was struggling with. Awesome!
Thank you Joey and Scrutinizer! You widened my ideas about solving the problem in different ways. I felt embarrassed when I saw Scrutinizer's script. There is lots to learn, and the tools are there. Thank you guys again!

Last edited by yifangt; 05-01-2012 at 04:15 PM..
# 6  
Old 05-02-2012
Quote:
Originally Posted by neutronscott
Standard awk unique program
Code:
$ awk '!a[$1,$2]++' input
D A 92.85   1315    83      11
D B 91.82   1321    92      16
D C 89.86   1321    119     15

If you also want to limit the output to just 3 matches: awk '!a[$1,$2]++&&++m;m==3{exit}' input
Can you explain what a[$1,$2]++ means? I have never seen this usage, and I don't know what to google to find an explanation.
# 7  
Old 05-02-2012
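This creates an element in array a, keyed on fields 1 and 2, and increments its value. The ++ is a post-increment, so the test sees the old value: the first time a key appears the element is empty (0), !0 is true, and the line is printed. If the element is already there, the test fails. The first line makes a["D","A"]=1, so the next time there is a "D A" line, !a["D","A"] is false and the line is skipped.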
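
If it helps, here is roughly what the one-liner does when written out longhand. Treat it as a sketch: the key variable and the sample rows are my own, not from the posts above. In awk a comma inside a subscript joins the parts with the built-in SUBSEP string.

```shell
rows='D A 92.85
D A 95.90
D B 91.82
D A 94.38'

expanded=$(printf '%s\n' "$rows" | awk '{
    key = $1 SUBSEP $2        # a[$1,$2] is shorthand for a[$1 SUBSEP $2]
    if (a[key] == 0)          # unseen keys start out empty, which compares equal to 0
        print                 # so only the first row for each key is printed
    a[key]++                  # count the row either way
}')

# The short form from post #2, for comparison.
short=$(printf '%s\n' "$rows" | awk '!a[$1,$2]++')

printf '%s\n' "$expanded"
```

Both forms print the D A 92.85 row and the D B 91.82 row for this sample.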

I do not know a better search term than "awk unique". A popular idiom for printing unique lines is awk '!a[$0]++'. The difference here is that we only use columns 1 and 2 as the key.
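
A small side-by-side may make the difference clearer. The rows below are invented for the example, and by_line and by_pair are just illustrative variable names.

```shell
data='D A 92.85
D A 92.85
D A 95.90
D B 91.82'

# Key on the whole line: only exact duplicate lines are dropped.
by_line=$(printf '%s\n' "$data" | awk '!a[$0]++')

# Key on fields 1 and 2: later rows with the same pair are dropped too.
by_pair=$(printf '%s\n' "$data" | awk '!a[$1,$2]++')

printf 'by line:\n%s\nby pair:\n%s\n' "$by_line" "$by_pair"
```

Keyed on $0, three rows survive (only the repeated D A 92.85 line goes). Keyed on $1,$2, two rows survive: the first D A row and the D B row.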