Filtering lines for column elements based on corresponding counts in another column


 
# 1  
Old 03-05-2012

Hi,

I have a file like this:
Code:
ACC 2 2 21 aaa 
AC 443 3 22 aaa  
GCT 76 1 33 xxx 
TCG 34 2 33 aaa 
ACGT 33 1 22  ggg 
TTC 99 3 44 wee 
CCA 33 2 33 ggg 
AAC 1 3 55 ddd 
TTG 10 1 22 ddd 
TTGC 98 3 22 ddd 
GCT 23 1 21 sds 
GTC 23 4 32 sds
ACGT 32 2 33 vvv 
CGT 11 2 33 eee 
CCC 87 2 44 eee 
As you can see, column 5 contains repeated values. For each distinct value in column 5, I want to print the line with the highest column 2 value.

If several lines share that maximum column 2 value, print only the first of them to appear in the file.

So the desired output is:
Code:
AC 443 3 22 aaa  
GCT 76 1 33 xxx 
ACGT 33 1 22  ggg 
TTC 99 3 44 wee 
TTGC 98 3 22 ddd 
GCT 23 1 21 sds 
ACGT 32 2 33 vvv 
CCC 87 2 44 eee 
Thanks in advance.
# 2  
Old 03-05-2012
Try:
Code:
awk '$2>M[$5]{M[$5]=$2;a[$5]=$0}END{for (i in a) print a[i]}' file
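
Two caveats on the one-liner above. First, `for (i in a)` walks the array in an unspecified order, so the lines may not come out in the order shown in the desired output. Second, an unset `M[$5]` compares as 0, so a column 5 value whose column 2 entries are all zero or negative would never be printed. Below is a minimal sketch of a variant that avoids both, assuming whitespace-separated fields and a numeric column 2. The function name `max_per_key` and the array names `order`, `max` and `line` are my own choices, not something from the thread.

```shell
# Hypothetical helper: for each column-5 value, print the first line holding
# the largest column-2 value, in order of first appearance of that value.
max_per_key() {
    awk '
        # New key: remember the order in which keys were first seen.
        !($5 in line) { order[++n] = $5 }

        # Store the line for a new key, or when column 2 is strictly larger.
        # The strict ">" keeps the first line among tied maxima.
        !($5 in line) || $2 + 0 > max[$5] { max[$5] = $2 + 0; line[$5] = $0 }

        # Print the stored lines in first-seen key order.
        END { for (i = 1; i <= n; i++) print line[order[i]] }
    ' "$@"
}
```

Call it as `max_per_key file`, or pipe data into it. On the sample input from post #1 it should give the eight lines of the desired output, in the same order.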

This User Gave Thanks to bartus11 For This Post: