how can i group by same columns by another columns in Bash


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting how can i group by same columns by another columns in Bash
# 1  
Old 02-28-2011
how can i group by same columns by another columns in Bash

how can i group by same columns by another columns in Bash

Eq. this is a csv file
Code:
Co1 Co2 Co3    Co4
A     A     1,000  1,000
A     B     2,000  1,250
A     A     2,000  3,002
A     C     2,000  3,005


how can i get the result of like this

Code:
Co1 Co2 Co3    Co4
A     A     3,000  4,002
A     B     2,000  1,250
A     C     2,000  3,005


Last edited by Franklin52; 02-28-2011 at 03:53 AM.. Reason: Please use code tags, thank you
# 2  
Old 02-28-2011
Code:
 echo "Co1 Co2 Co3    Co4
A     A     1,000  1,000
A     B     2,000  1,250
A     A     2,000  3,002
A     C     2,000  3,005" |awk 'NR==1{print;next}
{a[$1 FS $2]+=gensub(",","",1,$3);b[$1 FS $2]+=gensub(",","",1,$4)}
END{for(i in a) print i,gensub(/(.)(.*)/,"\\1,\\2",1,a[i]),gensub(/(.)(.*)/,"\\1,\\2",1,b[i])}'
Co1 Co2 Co3    Co4
A A 3,000 4,002
A B 2,000 1,250
A C 2,000 3,005

# 3  
Old 02-28-2011
Remove the ',' from your file and apply this command.

Code:
awk 'NR==1{print;next} {arr1[$2]=arr1[$2] + $3;arr2[$2]= arr2[$2] + $4;} END{for (x in arr1) print "A  "x"  " arr1[x]"  "arr2[x]}'

# 4  
Old 02-28-2011
Code:
# cat file
Co1 Co2 Co3    Co4
A     A     1,000  1,000
A     B     2,000  1,250
A     A     2,000  3,002
A     C     2,000  3,005
# awk '{$1=$1;gsub(/,/,".")}NR==1{print;next}{a1[$1 OFS $2]+=$3;a2[$1 OFS $2]+=$4}END{for(_ in a1){printf "%s\t%.3f\t%.3f\n",_,a1[_],a2[_]}}' OFS='\t' file
Co1     Co2     Co3     Co4
A       A       3.000   4.002
A       B       2.000   1.250
A       C       2.000   3.005

# 5  
Old 02-28-2011
Thank u very much!
these is an other problem for me,
I don't know the method of gensub' means

How can I printout the result with format "###,###,###"

Quote:
Originally Posted by yinyuemi
Code:
 echo "Co1 Co2 Co3    Co4
A     A     1,000  1,000
A     B     2,000  1,250
A     A     2,000  3,002
A     C     2,000  3,005" |awk 'NR==1{print;next}
{a[$1 FS $2]+=gensub(",","",1,$3);b[$1 FS $2]+=gensub(",","",1,$4)}
END{for(i in a) print i,gensub(/(.)(.*)/,"\\1,\\2",1,a[i]),gensub(/(.)(.*)/,"\\1,\\2",1,b[i])}'
Co1 Co2 Co3    Co4
A A 3,000 4,002
A B 2,000 1,250
A C 2,000 3,005

# 6  
Old 02-28-2011
Quote:
Originally Posted by qjlongs
Thank u very much!
these is an other problem for me,
I don't know the method of gensub' means

How can I printout the result with format "###,###,###"
The gensub function works only with gawk.
This User Gave Thanks to Franklin52 For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

To group the text (rows) by similar columns-names in a file

As part of some report generation, I've written a script to fetch the values from DB. But, unluckily, for certain Time ranges(1-9.99,10-19.99 etc), I don't have data in DB. In such cases, I would like to write zero (0) instead of empty. The desired output will be exported to csv file. ... (1 Reply)
Discussion started by: kumar_karpuram
1 Replies

2. UNIX for Beginners Questions & Answers

Group by columns and add sum in new columns

Dear Experts, I have input file which is comma separated, has 4 columns like below, BRAND,COUNTRY,MODEL,COUNT NIKE,USA,DUMMY,5 NIKE,USA,ORIGINAL,10 PUMA,FRANCE,DUMMY,20 PUMA,FRANCE,ORIGINAL,15 ADIDAS,ITALY,DUMMY,50 ADIDAS,ITALY,ORIGINAL,50 SPIKE,CHINA,DUMMY,1O And expected output add... (2 Replies)
Discussion started by: ricky1991
2 Replies

3. Shell Programming and Scripting

Average across multiple columns group by

Hi experts, I want to group by average, for multiple columns starting column $7 until NF, group by ($1-$5), please help For just 7th column, I can do awk ' NR>1{ arr += $7 count += 1 } END{ for (a in arr) { print a, arr/count ... (10 Replies)
Discussion started by: ritakadm
10 Replies

4. Shell Programming and Scripting

Get the SUM of TWO columns SEPARATELY by doing GROUP BY on other columns

My File looks like: "|" -> Field separator A|B|C|100|1000 D|E|F|1|2 G|H|I|0|7 D|E|F|1|2 A|B|C|10|10000 G|H|I|0|7 A|B|C|1|100 D|E|F|1|2 I need to do a SUM on Col. 5 and Col.6 by grouping on Col 1,2 & 3 My expected output is: A|B|C|111|11100 (2 Replies)
Discussion started by: machomaddy
2 Replies

5. Shell Programming and Scripting

Summing columns over group of lines

I have an input file that looks like: ID1 V1 ID2 V2 P1 P2 P3 P4 ..... n no. of columns 1 1 1 1 1.0000 1.0000 1.0000 1.0000 1 1 1 2 0.9999 0.8888 0.7777 0.6666 1 2 1 1 0.8888 0.7777 0.6666 0.5555 1 2 1 2 0.7777 0.6666 0.5555 0.4444 2 1 1 1 0.6666 0.5555 0.4444 0.3333 2 1 1 2 0.5555 0.4444... (4 Replies)
Discussion started by: sdp
4 Replies

6. Shell Programming and Scripting

Add the values in second and third columns with group by on first column.

Hi All, I have a pipe seperated file. I need to add the values in second and third columns with group by on first column. MYFILE_28012012_1115|47|173.90 MYFILE_28012012_1115|4|0.00 MYFILE_28012012_1115|6|22.20 MYFILE_28012012_1116|47|173.90 MYFILE_28012012_1116|4|0.00... (3 Replies)
Discussion started by: angshuman
3 Replies

7. UNIX for Advanced & Expert Users

Unix Bash: substitute columns in .csv using other .csv columns

Hi All, I have two .csv's input.csv having values as (7 columns) ABC,A19907103,ABC DEV YUNG,2.17,1000,2157,07/07/2006 XYZ,H00213850,MM TRUP HILL,38.38,580,23308,31/08/2010 output.csv having (25 columns) A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y... (4 Replies)
Discussion started by: abhivyas
4 Replies

8. Shell Programming and Scripting

Convert rows to columns group

Hi I have the input file following like this "AIX" "AIX 6.0" "AIX 7.0" "Redhat 8" "Redhat 9" "Redhat 5.0 Enterprise Linux" "Sun Solaris 9" "Sun Solaris 10", "Sun Microsystems" "Oracle" .................................Like this 2000 lines I need to convert this input into... (5 Replies)
Discussion started by: selvanraj
5 Replies

9. Shell Programming and Scripting

Single command for add 2 columns and remove 2 columns in unix/performance tuning

Hi all, I have created a script which adding two columns and removing two columns for all files. Filename: Cust_information_1200_201010.txt Source Data: "1","Cust information","123","106001","street","1-203 high street" "1","Cust information","124","105001","street","1-203 high street" ... (0 Replies)
Discussion started by: onesuri
0 Replies

10. UNIX for Advanced & Expert Users

Group by columns and get the counts

Hi Gurus, I have a file 1|usa|hh 2|usa|ll 3|usa|vg 4|usa|vg 5|usa|vg 6|usa|vg 7|usa|ll 8|uk|nn 9|uk|bb 10|uk|bb 11|kuwait|mm 12|kuwait|jkj 13|kuwait|mm 14|dubai|hh I want to group by last two columns and get the last two recs and count. (3 Replies)
Discussion started by: sumeet
3 Replies
Login or Register to Ask a Question