Sum using awk


 
# 1  
Old 05-18-2010

Hi all,

I need to sum values for fields in a delimited file as below:

Code:
 
2010-03-05|||
2010-03-05|||123
2010-03-05|467.621|369.532|
2010-03-06|||
2010-03-06||2|
2010-03-06|||444
2010-03-07|||
2010-03-07|||
2010-03-07|655.456|1019.301|

Code used is:
Code:
 
nawk -F "|" ' { sum[$1] += $2; sum1[$1] +=$3; sum2[$1] +=$4 } END { for (k in sum) print k "|" sum[k] "|" sum1[k] "|" sum2[k] }'

output required (and achieved):
Code:
 
2010-03-05|467.621|369.532|123
2010-03-06|0|2|444
2010-03-07|655.456|1019.301|

QUESTIONS:

1. When summing empty fields, how do I get the (sum) output value to reflect an empty field and not a zero?
2. This script needs to be used for multiple files with varying numbers of columns / fields. The above would require me to set up a separate script for each input file and hardcode the number of columns (sections in blue) for every file. I would like to write a single script that handles any number of columns.

In searching for a solution, I understand that NF may work to do this but I have no idea of what the syntax should look like.

Can anyone assist?

Regards,

Bennie.

Moderator's Comments:
Use code tags please, ty.
# 2  
Old 05-18-2010
Code:
nawk -F\| 'END {
  for (K in k) {
    printf "%s", K FS  
    for (i = 1; ++i <= nf;)
      printf "%s", (v[K, i] ? v[K, i] : x) \
       (i < nf ? FS : RS) 
    }
  }
{
  for (i = 0; ++i <= NF;) {
    v[$1, i] += $i; k[$1]
    }
  NF > nf && nf = NF    
  }' infile

This User Gave Thanks to radoulov For This Post:
# 3  
Old 05-18-2010
First, you need to pin down your requirements. Your sample has only sporadically populated values, and they may also be of mixed precision. How many decimal places do you want in the calculated values? Have you explored printf, as opposed to print? It lets you apply a format mask.
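
On the precision point: awk converts a non-integer number to a string using CONVFMT (and print uses OFMT), both of which default to %.6g, so a sum such as 1019.301 can come out as 1019.3. A minimal sketch, assuming a POSIX awk is available as awk; show_sum is a hypothetical helper name, not something from this thread:

```shell
# show_sum: hypothetical helper. Sums field 3 of one sample row and
# prints the result using the CONVFMT passed as the first argument.
show_sum() {
  printf '%s\n' '2010-03-07|655.456|1019.301|' |
    awk -F'|' -v CONVFMT="$1" '{ s += $3 } END { print "sum=" s }'
}

show_sum '%.6g'    # the default format: prints sum=1019.3
show_sum '%.15g'   # a wider format:     prints sum=1019.301
```

If a fixed number of decimal places is wanted instead, printf with an explicit format such as "%.3f" is the other option.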

As for the zero-sum columns, you could pipe the end result through a sed filter that strips the 0 strings as needed.
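
A rough sketch of such a filter, assuming "|" as the delimiter and that a field holding exactly 0 should become empty; blank_zero_fields is a hypothetical name. Note that a filter like this cannot tell a sum that is truly zero from a sum of blanks, so it only fits if that distinction does not matter:

```shell
# blank_zero_fields: hypothetical helper. Empties every field that is exactly "0".
# The first expression is repeated because adjacent zero fields share a delimiter,
# so a single global pass only catches every other one.
blank_zero_fields() {
  sed -e 's/|0|/||/g' -e 's/|0|/||/g' -e 's/|0$/|/'
}

printf '%s\n' '2010-03-06|0|2|444' | blank_zero_fields   # prints 2010-03-06||2|444
```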

You may wish to run it all through a case statement, or an if...then construct, to apply different script versions according to each file's layout.

Last edited by curleb; 05-18-2010 at 11:04 AM..
# 4  
Old 05-18-2010
something like:

Code:
#  nawk -F "|" '{k=NF;t[$1]++;for (i=2;i<=NF;i++){s[$1"_"i]+=$i}}END{for (i in t){str=i;for (x=2;x<=k;x++){str=str"|"s[i"_"x]};print str}}' infile 
2010-03-05|467.621|369.532|123
2010-03-06|0|2|444
2010-03-07|655.456|1019.3|0

should be close...



Just saw Radoulov's post. Yep, you're still the master...

Last edited by Tytalus; 05-18-2010 at 11:09 AM.. Reason: I see Radoulov has displayed his wizardry again
# 5  
Old 05-18-2010
Radoulov,

Just saw Tytalus's comment.

You are definitely the MASTER!

Works like a charm.

Thank you very much.
# 6  
Old 05-19-2010
Hi Radoulov,

Apologies for bothering you again. On checking the output again, I noticed that ALL the fields with ZERO or "BLANK" values are printed as "BLANK". In our environment, however, ZERO has a specific meaning / value and therefore, if the summarized value is truly ZERO, I need it to display as such.

In other words, if 4 values are summed (3 Blanks and one ZERO), the output needs to reflect ZERO and not blank.

Hope you are able to assist.

Bennie.
# 7  
Old 05-19-2010
Try this one:
Code:
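nawk -F\| 'END {
  # one output line per key; for-in visits keys in no guaranteed order
  for (K in k) {
    printf "%s", K FS
    # print a sum only where a non-empty value was seen; x is never assigned, so it prints as empty
    for (i = 1; ++i <= nf;)
      printf "%s", ((K, i) in nn ? v[K, i] : x) \
        (i < nf ? FS : RS)
    }
  }
{
  for (i = 1; ++i <= NF;) {
    v[$1, i] += $i
    # referencing nn[$1, i] creates the element, which marks this column as non-empty
    $i == "" || nn[$1, i]
    }
  # track the widest record seen and remember the key
  NF > nf && nf = NF; k[$1]
    }' infile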
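
For readers who find the compact form hard to follow, here is a more verbose sketch of the same idea. It is not the script above. It assumes a POSIX awk available as awk rather than nawk, raises CONVFMT so sums keep more than six significant digits, and prints keys in first-seen order. The names sum_by_key, seen, order and filled are made up for illustration:

```shell
# sum_by_key: hypothetical helper name, not from the thread.
# Sums every column after the first, grouped by the first column.
sum_by_key() {
  awk -F'|' '
    BEGIN { CONVFMT = "%.15g" }   # assumption: keep more digits than the default %.6g
    {
      if (!($1 in seen)) { seen[$1]; order[++n] = $1 }   # remember keys in first-seen order
      if (NF > nf) nf = NF                               # widest record decides the column count
      for (i = 2; i <= NF; i++)
        if ($i != "") {           # skip blanks so they never turn into 0
          sum[$1, i] += $i
          filled[$1, i]           # mark: this column held a real value
        }
    }
    END {
      for (j = 1; j <= n; j++) {
        key = order[j]
        line = key
        for (i = 2; i <= nf; i++)
          line = line FS (((key, i) in filled) ? sum[key, i] : "")
        print line
      }
    }' "$@"
}

# Three blanks and one zero sum to 0; a column that is always blank stays blank.
printf '%s\n' 'k1||' 'k1|0|' 'k1||' 'k1||' | sum_by_key   # prints k1|0|
```

Called as sum_by_key infile, it reads the file instead of standard input.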

This User Gave Thanks to radoulov For This Post: