

Evaluate 2 columns, add sum IF two columns match on two rows


 
# 1  
Old 08-07-2013

Hi all, I know this sounds suspiciously like a homework assignment, but it is not.

My goal is to take a file and, wherever two rows share the same "ID" and "Date" values, add up the minutes worked and write a single row with the total, without printing the original rows that were summed. I've tried a lot of commands I found on the forum (mainly using awk) and tried piecing my own answer together, but can't quite get it. I have one file that looks like this:

ID#,Date,Minutes
100,5/20/2013,22
101,7/21/2013,33
101,7/21/2013,73
101,7/21/2013,73
101,7/23/2013,26
102,7/24/2013,43

The net result I'd like to achieve is:

ID#,Date,Minutes
100,5/20/2013,22
101,7/21/2013,179
101,7/23/2013,26
102,7/24/2013,43

I have two separate Perl scripts that take different files, chop up the columns from those files and combine them into this one file, but I don't know where to go from here to achieve the results I'm looking for. The closest command I've come across is:

Code:
awk '{last=$2}{if(last == $3) getline;print}' f.csv

Which sounds KIND of like what I'm trying to do, to compare the rows/columns to each other and print the results, but I don't understand how to produce the sum if the condition is true.

Is using awk even appropriate? I sure would appreciate any help to go in the right direction.
# 2  
Old 08-07-2013
Try this:
Code:
awk -F, 'NR==1; NR>1 {Arr[$1","$2]+=$3} END{for (i in Arr) print i","Arr[i]}' file
ID#,Date,Minutes
100,5/20/2013,22
102,7/24/2013,43
101,7/23/2013,26
101,7/21/2013,179

# 3  
Old 08-07-2013
Rudi!!! You are awesome, that is exactly what I was looking for. If I may, can I ask a question about the line?

I know from researching the little I do know about "awk" that:

-F tells the field separator
NR is number of rows

But how does the rest of the statement work? Sorry, I know it's trivial, but I do like to learn (and have already learned a lot from scanning the forums).

Thank you again!
# 4  
Old 08-07-2013
This may change the order of the lines
Code:
awk -F, 'NR>1 {a[$1","$2]+=$3} END {for (i in a) print i","a[i]}' file
100,5/20/2013,22
102,7/24/2013,43
101,7/23/2013,26
101,7/21/2013,179

Edit: Hmm, this was nearly the same as what RudiC posted.
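
If the original row order matters, one possible variation (a sketch, not something posted in the thread) is to remember each ID/Date key the first time it is seen and print the totals in that order. The file name minutes.csv, the function name sum_in_order and the array names sum and order are made up for illustration; the sample data is the input from the first post.

```shell
# Sample input from the first post, written to a scratch file (hypothetical name).
cat > minutes.csv <<'EOF'
ID#,Date,Minutes
100,5/20/2013,22
101,7/21/2013,33
101,7/21/2013,73
101,7/21/2013,73
101,7/23/2013,26
102,7/24/2013,43
EOF

# Hypothetical helper: sum column 3 per (column 1, column 2) pair,
# printing the groups in the order they first appear in the input.
sum_in_order() {
  awk -F, '
    NR == 1 { print; next }                 # pass the header line through unchanged
    NF < 3  { next }                        # skip blank or short lines
    {
      key = $1 "," $2
      if (!(key in sum)) order[++n] = key   # remember first-seen position of each key
      sum[key] += $3                        # accumulate the minutes
    }
    END { for (i = 1; i <= n; i++) print order[i] "," sum[order[i]] }
  ' "$1"
}

sum_in_order minutes.csv
```

With the sample above this prints the header followed by 100,5/20/2013,22 then 101,7/21/2013,179 then 101,7/23/2013,26 then 102,7/24/2013,43, which is the order asked for in the first post.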
# 5  
Old 08-07-2013
Great, thanks Jotne, I sure appreciate your answer as well! I don't mind about the rows being rearranged... the "sort" command is absolutely divine!!
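
For anyone following along, here is a rough sketch of how the sort step could be combined with the awk command from the thread while keeping the header on top. The file name minutes.csv and the function name sum_sorted are hypothetical. Note the assumption that the date column is compared as plain text, so a date such as 7/3/2013 would sort after 7/21/2013; that is fine for the sample data but not in general.

```shell
# Sample input from the first post, written to a scratch file (hypothetical name).
cat > minutes.csv <<'EOF'
ID#,Date,Minutes
100,5/20/2013,22
101,7/21/2013,33
101,7/21/2013,73
101,7/21/2013,73
101,7/23/2013,26
102,7/24/2013,43
EOF

# Hypothetical helper: aggregate the data rows, sort them by ID (numeric)
# and then by date (as text), and print the original header line first.
sum_sorted() {
  awk -F, 'NR > 1 && NF >= 3 { a[$1 "," $2] += $3 }
           END { for (i in a) print i "," a[i] }' "$1" |
    LC_ALL=C sort -t, -k1,1n -k2,2 |
    { head -n 1 "$1"; cat; }   # header comes from the file, sorted rows from the pipe
}

sum_sorted minutes.csv
```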

Is there a way I can change the topic of this thread to something like "[ANSWERED]" so that others can see this thread was answered when scanning the forums?
# 6  
Old 08-07-2013
Code:
awk -F, 'NR==1;                                 # print line #1 as is (pattern is true; print is default action)
         NR>1 {Arr[$1","$2]+=$3}                # create (or use if exists) array indexed by $1,$2 and sum $3 into it
         END{for (i in Arr) print i","Arr[i]}   # At EOF, loop i over the indices of the array; print each index and the array element it points to
        ' file
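
As a small aside on the question about NR: awk increments NR for every record it reads, so it holds the number of the current line while the file is being processed and the total count once END is reached. The tiny sketch below (made-up three-line input) shows both that and the "pattern with no action prints the line" rule used by NR==1; above.

```shell
# Three input lines a, b, c: the bare pattern NR == 1 prints only the first line,
# and in END the counter NR holds the total number of records read.
printf 'a\nb\nc\n' | awk 'NR == 1; END { print "records read:", NR }'
```

This prints a on the first line and records read: 3 on the second.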

# 7  
Old 08-07-2013
Rudi, thanks for taking your time to explain that. I sure appreciate it.
