Filter Row Based On Max Column Value After Group BY


 
# 1  
Old 08-07-2016

Hello Team,

Need your expertise on following:

Here is the set of data:

Code:
C1|4|C1SP1|A1|C1BP1|T1
C1|4|C1SP2|A1|C1BP2|T2
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2
C2|2|C2SP2|A2|C2BP2|T1
I need to filter the above data in the following two steps:

1. Group the rows by columns 1 and 4.
2. Once grouped, print the row where column 6 is the maximum.

For example:
In column 1 we have C1, C2 and C3. There are two rows where column 1 = C1 and column 4 = A1.
Of these two rows, the second should be printed because its column 6 has the greater value, T2.

Following will be the output:

Code:
C1|4|C1SP2|A1|C1BP2|T2
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2
Note: 1 < 2, which means T1 < T2, and so on.


Thanks
Angsuman

Last edited by angshuman; 08-07-2016 at 10:27 AM.. Reason: Typo
# 2  
Old 08-07-2016
Hello angshuman,

Could you please try the following and let me know if it helps?
Code:
awk -F"|" FNR==NR'{A[$1,$4]=A[$1,$4]>$2?A[$1,$2]:$2;next}  (($1,$4) in A){print;delete A[$1,$4]}'  Input_file  Input_file

Output will be as follows.
Code:
C1|4|C1SP1|A1|C1BP1|T1
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2

Thanks,
R. Singh
# 3  
Old 08-07-2016
Hello Ravinder,

Thank you for your help. In your code, you have used $2. Does $2 refer to field 2? If so, I think it should be $6, because I want to find the row where $6 is greatest after grouping the rows by $1 and $4.

In any case, your code did not produce the expected output.

Thanks
Angsuman
# 4  
Old 08-07-2016
Here's my try:

Code:
 awk -F'|' '{if(d[$1$4"_2"]<$6){d[$1$4"_1"]=$0;d[$1$4"_2"]=$6}} END{ for(v in d) {if(substr(v,length(v))==1) print d[v]}}' input_file1 ...


Last edited by stomp; 08-07-2016 at 11:50 AM..
# 5  
Old 08-07-2016
Hello angshuman,

Could you please try the following and let me know if it helps you?
Code:
awk -F"|" 'FNR==NR{sub(/[[:alpha:]]/,X,$NF);A[$1,$4]=A[$1,$4]>$NF+0?A[$1,$4]:$NF+0;next} {Q=$NF;sub(/[[:alpha:]]/,X,Q);if(A[$1,$4]==(Q+0)){print}}'  Input_file  Input_file

Output will be as follows.
Code:
C1|4|C1SP2|A1|C1BP2|T2
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2

Thanks,
R. Singh

Last edited by RavinderSingh13; 08-07-2016 at 12:12 PM.. Reason: added 0 to $NF's value.
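
The two-pass idea in this post (read the file once to record each group's maximum, then read it again to print the matching rows) can be sketched more explicitly as below. This is only an illustrative sketch, not R. Singh's exact code: the helper function `rank`, the array name `max` and the temporary file are assumptions made for the example, and it assumes column 6 always looks like a letter prefix followed by an integer, as in the sample data.

```shell
# Sample data from the first post, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
C1|4|C1SP1|A1|C1BP1|T1
C1|4|C1SP2|A1|C1BP2|T2
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2
C2|2|C2SP2|A2|C2BP2|T1
EOF

# The same file is named twice: pass 1 records maxima, pass 2 prints matches.
result=$(awk -F'|' '
    # rank(): numeric part of a value such as "T2" (assumed format: letters, then digits)
    function rank(s) { gsub(/[^0-9]/, "", s); return s + 0 }

    FNR == NR {                          # pass 1: FNR equals NR only for the first file
        r = rank($6)
        if (!(($1, $4) in max) || r > max[$1, $4])
            max[$1, $4] = r              # highest rank seen so far for this group
        next
    }

    rank($6) == max[$1, $4]              # pass 2: print rows that hold the group maximum
' "$input" "$input")

rm -f "$input"
printf '%s\n' "$result"
```

With this approach the rows come out in input order, and if two rows of the same group tie for the maximum, both are printed.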
# 6  
Old 08-07-2016
Hello Ravinder,

Sorry, it did not work this time either. I am using Enterprise Linux, and the awk version is GNU awk 3.1.5.

Hello stomp,

Your try worked for me. I would like to understand the code you have given. Can you please guide me through it? What is the purpose of "_2" and "_1"?

Thanks
Angsuman
# 7  
Old 08-07-2016
Code:
awk -F'|' '{if(d[$1$4"_2"]<$6){d[$1$4"_1"]=$0;d[$1$4"_2"]=$6}} END{ for(v in d) {if(substr(v,length(v))==1) print d[v]}}' input_file1 ...

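d[..._1] contains the whole line of the current best row for a group (the group being $1 and $4 joined together).
d[..._2] contains that row's $6, the value used for the comparison.

Only the ..._1 entries should be printed at the end, which is why the END block checks that the last character of each key is 1.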

Last edited by stomp; 08-07-2016 at 12:00 PM..
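
The same single-pass idea can be spelled out with separate arrays, which may make the roles of "_1" and "_2" easier to follow. This is a hedged sketch rather than stomp's code: the array names `best`, `line` and `order` are made up for the example, and it assumes column 6 is a letter prefix followed by an integer. It differs from the one-liner in three ways: the key uses SUBSEP instead of plain concatenation (so that, for instance, "C1" + "2A" and "C12" + "A" cannot collide), the comparison is numeric (a string comparison would rank T10 below T2), and groups are printed in the order they first appear (the order of `for (v in d)` is unspecified in awk).

```shell
# Sample data from the first post, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
C1|4|C1SP1|A1|C1BP1|T1
C1|4|C1SP2|A1|C1BP2|T2
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2
C2|2|C2SP2|A2|C2BP2|T1
EOF

result=$(awk -F'|' '
    {
        key = $1 SUBSEP $4               # group key built from columns 1 and 4
        val = $6
        sub(/^[^0-9]*/, "", val)         # "T2" -> "2"
        val += 0                         # force a numeric comparison below

        if (!(key in best))
            order[++n] = key             # remember the order in which groups first appear

        if (!(key in best) || val > best[key]) {
            best[key] = val              # plays the role of d[..."_2"]: value to compare
            line[key] = $0               # plays the role of d[..."_1"]: whole row to print
        }
    }
    END {
        for (i = 1; i <= n; i++)
            print line[order[i]]
    }
' "$input")

rm -f "$input"
printf '%s\n' "$result"
```

When two rows of a group tie for the maximum, this version keeps the first one it reads.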