Sum of numbers in three or more files

# 8  
Old 10-10-2013
I try to avoid loops in awk to speed things up. If you have gnu awk, you can do this:
awk '{a+=$1} END {print a}' RS=" |\n" file?

If you would like to store the result in a variable, do this:
var=$(awk '{a+=$1} END {print a}' RS=" |\n" file?)

PS, if you have awk on your system, why not use it?
This User Gave Thanks to Jotne For This Post:
# 9  
Old 10-10-2013
Originally Posted by Jotne
I try to avoid loops in awk to speed things up. If you have gnu awk, you can do this:
awk '{a+=$1} END {print a}' RS=" |\n" file?

PS, if you have awk on your system, why not use it?
I don't have awk, and I'm not a fan of awk. Even now I don't understand the command you wrote for me. I prefer loops and statements; they are clearer to me.
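
For anyone who shares that preference, a loop-only version in plain POSIX shell could look roughly like this. It is only a sketch: the sample files file1 and file2 and their contents are made up for illustration, and it handles whole numbers only, since shell arithmetic has no decimals.

```shell
# Hypothetical sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf '1 2 3\n' > "$tmpdir/file1"
printf '4 5\n6\n' > "$tmpdir/file2"

sum=0
for f in "$tmpdir"/file?; do
    # Read the file line by line, then walk the words on each line.
    while read -r line; do
        for n in $line; do
            sum=$((sum + n))
        done
    done < "$f"
done
echo "$sum"

rm -rf "$tmpdir"
```

The redirection at the end of the while loop (rather than a pipe into it) keeps the loop in the current shell, so sum is still set after the loop finishes.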
# 10  
Old 10-10-2013
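RS=" |\n" makes awk treat every space and every newline as a record separator, so each number in the file becomes a separate record (line). For example,
1 2 3
changes to
1
2
3

a+=$1 adds each record to the variable a
print a prints the variable a once all input has been read
file? is a shell pattern that matches file followed by exactly one character, for example file1 to file9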
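
To see each piece at work, here is a small sketch. It assumes two made-up sample files, file1 and file2, in a scratch directory, and an awk that accepts a regular expression in RS (gawk and mawk do):

```shell
# Hypothetical sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf '1 2 3\n' > "$tmpdir/file1"
printf '4 5\n6\n' > "$tmpdir/file2"

# With RS set to "space or newline", every number becomes its own record.
records=$(awk '{print $1}' RS=" |\n" "$tmpdir"/file?)
echo "$records"

# Adding $1 of every record therefore adds every number in every file.
total=$(awk '{a+=$1} END {print a}' RS=" |\n" "$tmpdir"/file?)
echo "$total"

rm -rf "$tmpdir"
```

The first command prints the six numbers one per line; the second prints their sum, 21.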

What system are you on?
# 11  
Old 10-10-2013
See my post in this forum
At the end it divides the sum to find the average.
This User Gave Thanks to MadeInGermany For This Post:
# 12  
Old 10-10-2013
Originally Posted by Jotne
I try to avoid loops in awk to speed things up. If you have gnu awk, you can do this:
awk '{a+=$1} END {print a}' RS=" |\n" file?

While there may be combinations of AWK implementation and operating system on which your suggestion is faster, I compared it against its predecessor on two combinations and yours was slower every time.

$ seq 1000000 | paste - - - - - - - - - - > data

$ wc data
 100000 1000000 6888896 data

$ head -n5 data
1       2       3       4       5       6       7       8       9       10
11      12      13      14      15      16      17      18      19      20
21      22      23      24      25      26      27      28      29      30
31      32      33      34      35      36      37      38      39      40
41      42      43      44      45      46      47      48      49      50

$ tail -n5 data
999951  999952  999953  999954  999955  999956  999957  999958  999959  999960
999961  999962  999963  999964  999965  999966  999967  999968  999969  999970
999971  999972  999973  999974  999975  999976  999977  999978  999979  999980
999981  999982  999983  999984  999985  999986  999987  999988  999989  999990
999991  999992  999993  999994  999995  999996  999997  999998  999999  1000000

For each of the following results, the best of 5 runs was chosen.

Cygwin/GAWK 4.1.0:
$ time gawk '{for(i=1;i<=NF;i++)t+=$i} END {print t}' data

real    0m1.359s
user    0m1.327s
sys     0m0.015s

$ time gawk '{a+=$1} END {print a}' RS=' |\t|\n' data

real    0m2.797s
user    0m2.796s
sys     0m0.030s

Linux/MAWK 1.3.3:
$ time mawk '{for(i=1;i<=NF;i++)t+=$i} END {print t}' data

real    0m0.753s
user    0m0.640s
sys     0m0.032s

$ time mawk '{a+=$1} END {print a}' RS=' |\t|\n' data

real    0m1.346s
user    0m1.268s
sys     0m0.012s

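
Before trusting any timing comparison, it is worth confirming that the two commands compute the same total. The following is a sketch on a smaller, made-up data set (the integers 1 to 1000, built the same way as data above); awk here stands for whichever implementation is being tested, and the variable names are only for illustration.

```shell
# Hypothetical smaller data set: 1..1000, ten numbers per line, tab-separated.
small=$(mktemp)
seq 1000 | paste - - - - - - - - - - > "$small"

# Field loop: add every field of every line.
loop_sum=$(awk '{for(i=1;i<=NF;i++)t+=$i} END {print t}' "$small")

# Record-separator variant: every space, tab or newline ends a record.
rs_sum=$(awk '{a+=$1} END {print a}' RS=' |\t|\n' "$small")

echo "$loop_sum $rs_sum"
rm -f "$small"
```

Both values should be 500500, the sum of 1 to 1000. The check uses a small range on purpose: some awk builds print very large totals in exponent notation, which would make a plain string comparison misleading.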
In my opinion, unless there is a confirmed performance issue and unless the AWK implementation is known, unqualified AWK optimization tips are usually a bad idea (doubly so when advising a novice who is more likely to blindly internalize the advice).

Different awk implementations, and even different versions of the same implementation, implement differing sets of optimization strategies. One example I ran into recently: gawk lazily recomputes $0. As you probably know, POSIX requires recomputing $0 whenever a field is modified. gawk will not perform that recomputation until $0 is referenced (if at all). That optimization in effect:
$ time gawk '{for (i=1;i<=NF;i++) $i=""}' data

real    0m0.594s
user    0m0.593s
sys     0m0.030s

$ time mawk '{for (i=1;i<=NF;i++) $i=""}' data

real    0m1.039s
user    0m0.900s
sys     0m0.060s

Even though it is MAWK that has the speedy reputation, this version of GAWK is much faster here because it doesn't recompute $0 after each $i="" (since $0 is never referenced after a field modification, it is never recomputed).
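
For readers unsure what "recomputing $0" means, this short sketch shows the semantics only; it makes no claim about timings. The sample input strings are made up for illustration.

```shell
# After a field is modified, $0 is rebuilt from the fields,
# joined with OFS (a single space by default), so the tabs disappear.
rebuilt=$(printf 'a\tb\tc\n' | awk '{$2="X"; print $0}')
echo "$rebuilt"

# Emptying all three fields leaves only the two OFS separators in $0.
len=$(printf '1\t2\t3\n' | awk '{for (i=1;i<=NF;i++) $i=""; print length($0)}')
echo "$len"
```

The first command prints a X c and the second prints 2. In the timed commands above, $0 is never read after the assignments, which is why an implementation is free to skip the rebuild.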

These 2 Users Gave Thanks to alister For This Post:
# 13  
Old 10-10-2013
This was very interesting, and an eye-opener. I had never tested this; I just thought it might be slower to run things in a loop. This shows that assumption may be wrong.
Thanks for taking the time to test.
This User Gave Thanks to Jotne For This Post: