Filter Row Based On Max Column Value After Group BY


 
# 8  
Old 08-07-2016
Quote:
Originally Posted by angshuman
Hello Ravinder,
Sorry, this did not work this time either. I am using Enterprise Linux, and the awk version is GNU awk 3.1.5.
Thanks
Angsuman
Hello angshuman,

It is working fine for me. I tried it with the following Input_file.
Code:
cat  Input_file
C1|4|C1SP1|A1|C1BP1|T1
C1|4|C1SP2|A1|C1BP2|T2
C1|4|C1SP1|A1|C1BP1|T8
C1|4|C1SP2|A1|C1BP2|T3
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2
C2|2|C2SP2|A2|C2BP2|T1
C2|4|C1SP1|A3|C1BP1|T11
C2|4|C1SP2|A3|C1BP2|T21

Then, running the following command on the above Input_file, I got the output you requested.
Code:
awk -F"|" 'FNR==NR{sub(/[[:alpha:]]/,X,$NF);A[$1,$4]=A[$1,$4]>$NF+0?A[$1,$4]:$NF+0;next} {Q=$NF;sub(/[[:alpha:]]/,X,Q);if(A[$1,$4]==(Q+0)){print}}'  Input_file  Input_file

The output is as follows.
Code:
C1|4|C1SP1|A1|C1BP1|T8
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2
C2|4|C1SP2|A3|C1BP2|T21

Let me know if you have any questions about it.

Thanks,
R. Singh

Last edited by RavinderSingh13; 08-07-2016 at 12:13 PM..
# 9  
Old 08-07-2016
Also try:
Code:
awk -F \| '$6>M[$1,$4]{M[$1,$4]=$6; R[$1,$4]=$0} END{for(i in R) print R[i]}'  file

Please note that the comparison of $6 is a string comparison, not a numerical comparison. It is not clear what is regarded as the "greater value": as strings, for example, "T8" is greater than "T11".
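
To illustrate the difference, here is a minimal sketch that compares two of the sample values both ways. The variable names (a, b, s, n) are only for illustration, and stripping the non-digits before adding 0 is an assumption about what "greater" should mean for values such as T8 and T11.

```shell
# Compare the sample values T8 and T11 both ways.
out=$(printf 'T8|T11\n' | awk -F'|' '{
    a = $1; b = $2
    # String comparison: "T8" sorts after "T11" because "8" > "1".
    s = (a > b) ? a : b
    # Numeric comparison: strip non-digits, then force a number with +0.
    gsub(/[^0-9]/, "", a); gsub(/[^0-9]/, "", b)
    n = (a + 0 > b + 0) ? a : b
    print "string max: " s
    print "numeric max: T" n
}')
printf '%s\n' "$out"
```

If the values should be ranked by their numeric part, the one-liner above would need a similar strip-and-add-0 step before the comparison.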
# 10  
Old 08-07-2016
Ravinder's solution works for me if I execute it exactly as he stated.

Especially:

Code:
awk .... Input_file Input_file

The Input_file has to be specified twice.
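
A rough sketch of why the second file argument matters: FNR==NR is true only while awk reads its first file argument, so the first rule collects the maximum per group and the second rule prints during the second read. The expanded program below is my own rewrite for readability (the names prog, max, v, twice and once are hypothetical), not Ravinder's exact command, and it uses only part of the sample data.

```shell
# Part of the sample data from the post above, written to a temporary file.
f=$(mktemp)
cat > "$f" <<'EOF'
C1|4|C1SP1|A1|C1BP1|T1
C1|4|C1SP2|A1|C1BP2|T2
C1|4|C1SP1|A1|C1BP1|T8
C2|3|C2SP1|A2|C2BP1|T2
C2|2|C2SP2|A2|C2BP2|T1
EOF

prog='
FNR == NR {                      # true only while reading the first file argument
    v = $NF; sub(/[[:alpha:]]/, "", v)
    if (!(($1, $4) in max) || v + 0 > max[$1, $4]) max[$1, $4] = v + 0
    next                         # skip the printing rule during pass one
}
{                                # pass two: the same file, read again
    v = $NF; sub(/[[:alpha:]]/, "", v)
    if (v + 0 == max[$1, $4]) print
}'

twice=$(awk -F'|' "$prog" "$f" "$f")   # file named twice: both passes run
once=$(awk -F'|' "$prog" "$f")         # file named once: pass two never runs
rm -f "$f"
printf 'twice:\n%s\nonce: [%s]\n' "$twice" "$once"
```

With the file named only once, every line matches FNR==NR and hits next, so nothing is printed. That may be what happened when the command appeared not to work.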

@Scrutinizer: Thanks for optimizing :)
# 11  
Old 08-07-2016
Thank you, Stomp. Here is my understanding; please correct me if I am wrong:

1. awk scans each row of the file.
2. It stores the entire row in d[$1$4"_1"] and the column to compare in d[$1$4"_2"].
3. After comparing, it prints the row where the column 6 value is greater.

A few more questions:

1. What is the purpose of substr(v,length(v))==1)? Does this check for the value _1?
2. Does awk scan and store the entire file first and then start comparing?
3. How does awk know that columns 1 and 4 need to be identical, and that it should compare the column 6 values between the 1st and 2nd rows, and so on?

These questions may appear very naive. I am not very familiar with arrays in awk. Sorry to trouble you with this.

Thanks
Angsuman
# 12  
Old 08-07-2016
Hi Angshuman,

Take Scrutinizer's solution. It is the same as mine, but simplified. The process works like this:

  1. Read the input file line by line.
  2. If no line is stored yet for the group ($1$4), or the stored comparison value is lower than the current one, store the line in the array, replacing any existing line.
  3. At the end, print all lines stored in the array.
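
The steps above could be sketched roughly as follows. This is an annotated variant rather than anyone's exact code from this thread: the names key, val, best and row are hypothetical, and it compares the numeric part of column 6, which is an assumption about the intended ordering.

```shell
out=$(printf '%s\n' \
    'C1|4|C1SP1|A1|C1BP1|T1' \
    'C1|4|C1SP1|A1|C1BP1|T8' \
    'C1|4|C1SP2|A1|C1BP2|T3' |
awk -F'|' '
{
    key = $1 SUBSEP $4            # rows with equal columns 1 and 4 share one key
    val = $6; gsub(/[^0-9]/, "", val); val += 0   # numeric part of column 6
    # Step 2: first row of the group, or a larger value than the stored one.
    if (!(key in best) || val > best[key]) {
        best[key] = val
        row[key]  = $0
    }
}
# Step 3: after the last line, print the surviving row of each group.
END { for (k in row) print row[k] }')
printf '%s\n' "$out"
```

awk does not load the whole file first: it keeps only one stored row per group, and the array key built from columns 1 and 4 is what ties rows of the same group together.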
# 13  
Old 08-17-2016
What if the file contains data like the below? I am sorry if I have misunderstood the requirement.

Code:
[akshay@localhost tmp]$ cat file
C1|4|C1SP1|A1|C1BP1|T1
C1|4|C1SP2|A1|C1BP2|T2
C2|3|C2SP1|A2|C2BP1|T2
C3|3|C3SP1|A3|C3BP1|T2
C2|2|C2SP2|A2|C2BP2|T1111

Try this if output order doesn't matter:

Code:
[akshay@localhost tmp]$ awk -F\| '{g=$1 FS $4; v=$6; gsub(/[^0-9]*/,"",v); v+=0}(g in A && A[g]<v)||!(g in A){A[g]=v; B[g]=$0}END{for(i in B)print B[i]}' file
C3|3|C3SP1|A3|C3BP1|T2
C1|4|C1SP2|A1|C1BP2|T2
C2|2|C2SP2|A2|C2BP2|T1111
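
If output order does matter, one possible variation is to remember the order in which groups first appear and print in that order. This is a sketch under that assumption: the order array and the counter n are hypothetical additions, and v is forced to a number with += 0 so the comparison does not depend on how a particular awk classifies the value.

```shell
out=$(printf '%s\n' \
    'C1|4|C1SP1|A1|C1BP1|T1' \
    'C1|4|C1SP2|A1|C1BP2|T2' \
    'C2|3|C2SP1|A2|C2BP1|T2' \
    'C3|3|C3SP1|A3|C3BP1|T2' \
    'C2|2|C2SP2|A2|C2BP2|T1111' |
awk -F'|' '
{
    g = $1 FS $4
    v = $6; gsub(/[^0-9]/, "", v); v += 0
    if (!(g in A)) order[++n] = g          # remember first-seen order of groups
    if (!(g in A) || v > A[g]) { A[g] = v; B[g] = $0 }
}
END { for (i = 1; i <= n; i++) print B[order[i]] }')
printf '%s\n' "$out"
```

Groups are printed in the order of their first row in the input, not in the order of the winning row.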


Last edited by Akshay Hegde; 08-17-2016 at 08:22 AM..