## Sum value from selected lines script (awk,perl)

# 1
10-10-2009

Hello.
I am facing this two-part problem.

I have lines with this structure:

...........
12345678 4
12345989 13
12346356 205
12346644 74
12346819 22
.........

The first field (a timestamp) is non-decreasing: each value is greater than or equal to the previous one.

1) Sum the second fields of the lines whose first_field/500 (integer part) is equal.
2) Sum the second fields of the lines whose first fields differ by less than 500
(sliding window).

In the example above:

1) Because 12345678/500 and 12345989/500 both give 24691 (integer part), sum=4+13.
The 3rd line cannot be grouped with any other, so sum=205.
The 4th and 5th lines are grouped, so sum=74+22.

2) We group the 1st and 2nd lines because 12345989 - 12345678 < 500.
By analogy we group the 2nd and 3rd, the 3rd and 4th,
and the 3rd, 4th and 5th because 12346819 (the 5th line) - 12346356 (the 3rd line) < 500.

Is there any way (perl, awk, etc.) to do it?

Thanks

Paolo
 paolfili
# 2
10-10-2009
In Perl it is very simple to do.

But before I do the work for you, I would like to know what you have tried so far.

You should try first and ask for clarification or advice if you run into difficulties, which is always a good way to learn.
 thegeek
# 3
10-10-2009
I know a little awk and some elements of perl.

awk '{if ($1/500 > last_time_frame) { sum = $2 } else { sum+=$2;print sum };last_time_frame=$1/500;print sum}' AAAA.txt

It does not work.
 paolfili
# 4
10-10-2009
Sorry, but the problem is not clear enough.

Quote:
Originally Posted by paolfili
...
2)Sum the second fields if the difference between first fields is less than 500.
(sliding window)
What's the length of the sliding window ?

- Is it just 2 (1st & 2nd, 2nd & 3rd, 3rd & 4th, ...) ?
- Or is it 3 (1st, 2nd & 3rd; 2nd, 3rd & 4th; ...) ?

Hopefully, it's not a cartesian product, i.e.

1st vs. (2nd, 3rd, 4th, ... , last_row)
2nd vs. (1st, 3rd, 4th, ... , last_row)
3rd vs. (1st, 2nd, 4th, ... , last_row)
...
last_row vs. (1st, 2nd, 3rd, ..., last-1_row)

Quote:
...
1) Because 12345678/500 and 12345989/500 both give 24691, sum=4+13
We cannot group the 3rd line so sum=205
And we group the 4th and 5th line so sum=74+22
- Ok, and what do you want to do with the sum ?
- Do you want to display it ? Or do nothing with it (highly unlikely) ?
- If you want to display it, then how ? The total against each row ? Or the total against the first row only ? Or against the second row only ?

Quote:
...
and the 3rd, 4th and 5th because 12346819 (of the 5th line) - 12346356 (of the 3rd line) < 500
This brings me back to my first question: why compare the 3rd, 4th and 5th, considering that you have been comparing two at a time until now?
So again, what's the length of the sliding window ?

I guess a very simple example of your input file should help here. So, let's say your input file is as follows:

What do you want your output to look like ?

tyler_durden
 durden_tyler
# 5
10-10-2009
???

Quote:
Originally Posted by durden_tyler
Sorry, but the problem is not clear enough.

What's the length of the sliding window ?

- Is it just 2 (1st & 2nd, 2nd & 3rd, 3rd & 4th, ...) ?
- Or is it 3 (1st, 2nd & 3rd; 2nd, 3rd & 4th; ...) ?

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The length is whatever the data requires: 1, 1000 or 1000000 data samples.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Hopefully, it's not a cartesian product, i.e.

1st vs. (2nd, 3rd, 4th, ... , last_row)
2nd vs. (1st, 3rd, 4th, ... , last_row)
3rd vs. (1st, 2nd, 4th, ... , last_row)
...
last_row vs. (1st, 2nd, 3rd, ..., last-1_row)

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
No cartesian product
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

- Ok, and what do you want to do with the sum ?
- Do you want to display it ? Or do nothing with it (highly unlikely) ?
- If you want to display it, then how ? The total against each row ? Or the total against the first row only ? Or against the second row only ?

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Only print the value.
For the rest of the work I'm on my own. ;-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

This brings me back to my first question: why compare the 3rd, 4th and 5th, considering that you have been comparing two at a time until now?
So again, what's the length of the sliding window ?

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Time frame concept.
Events in a time frame (e.g. 500 microseconds).
I need to sum events in:
1) a STATIC time frame environment.
2) a DYNAMIC time frame environment (what digital signal processing calls a sliding window).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

I guess a very simple example of your input file should help here. So, let's say your input file is as follows:

What do you want your output to look like ?

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Case 1)
sum=1+2+3+4
sum=5+6+7

Case 2)
sum=1+2+3+4
sum=2+3+4+5+6
sum=3+4+5+6+7
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

tyler_durden
>>>>>>>>>>>>>>

Paolo
 paolfili
# 6
10-11-2009
Thanks for the clarification.

 durden_tyler
# 7
10-11-2009
Using awk:

Case1:
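A minimal awk sketch for the static case, assuming whitespace-separated `timestamp value` lines already sorted by timestamp. The function name `static_sums` and the frame width parameter `w` are illustrative, not taken from the thread. It prints one sum per frame, where a frame is every run of lines sharing the same `int(timestamp / w)`.

```shell
# static_sums [width]: read "timestamp value" lines on stdin,
# print the sum of the values for each static frame of the given width.
static_sums() {
    awk -v w="${1:-500}" '
    {
        frame = int($1 / w)          # integer part identifies the frame
        if (NR > 1 && frame != last) {
            print sum                # frame changed: emit the finished sum
            sum = 0
        }
        sum += $2
        last = frame
    }
    END { if (NR > 0) print sum }    # emit the last open frame
    '
}

# Sample data from post #1; prints 17, 205 and 96, one per line.
printf '%s\n' '12345678 4' '12345989 13' '12346356 205' \
              '12346644 74' '12346819 22' | static_sums 500
```

Because the input is sorted, a frame is finished as soon as the integer part changes, so no array is needed.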
Case2:
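A minimal awk sketch for the sliding case. It assumes the window ends at each line and covers every earlier line whose timestamp is less than `w` behind it, which is how the groups in post #1 read (1st+2nd, 2nd+3rd, 3rd+4th, 3rd+4th+5th). The function name `sliding_sums` and the parameter `w` are illustrative, not taken from the thread.

```shell
# sliding_sums [width]: read sorted "timestamp value" lines on stdin,
# print, for each line, the sum of all values in the window ending there.
sliding_sums() {
    awk -v w="${1:-500}" '
    BEGIN { lo = 1 }                 # index of the oldest line still in the window
    {
        t[NR] = $1; v[NR] = $2
        sum += $2
        # drop lines that are w or more behind the current timestamp
        while ($1 - t[lo] >= w) {
            sum -= v[lo]
            delete t[lo]; delete v[lo]
            lo++
        }
        print sum
    }
    '
}

# Sample data from post #1; prints 4, 17, 218, 279 and 301, one per line.
printf '%s\n' '12345678 4' '12345989 13' '12346356 205' \
              '12346644 74' '12346819 22' | sliding_sums 500
```

Since the timestamps never decrease, the lower bound `lo` only moves forward, so each line is added and removed at most once. The first output (4) is the window that contains only the first line.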
Case1+2 combined:
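A possible single-pass sketch that reports both cases side by side, under the same assumptions as before (sorted `timestamp value` input, a window that ends at the current line). The function name `combined_sums` and the four-column output layout are hypothetical choices for illustration. Each output line shows the timestamp, the value, the running sum of the current static frame, and the sliding-window sum.

```shell
# combined_sums [width]: print "timestamp value frame_sum window_sum" per line.
combined_sums() {
    awk -v w="${1:-500}" '
    BEGIN { lo = 1 }
    {
        # static frame: restart the running total when int(timestamp / w) changes
        frame = int($1 / w)
        if (NR == 1 || frame != last) fsum = 0
        fsum += $2
        last = frame

        # sliding window: keep only lines less than w behind the current one
        t[NR] = $1; v[NR] = $2; ssum += $2
        while ($1 - t[lo] >= w) {
            ssum -= v[lo]
            delete t[lo]; delete v[lo]
            lo++
        }

        print $1, $2, fsum, ssum
    }
    '
}

printf '%s\n' '12345678 4' '12345989 13' '12346356 205' \
              '12346644 74' '12346819 22' | combined_sums 500
```

The third column holds a frame's total only on the last line of that frame; on earlier lines it is a partial sum.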
Original testset:
 Scrutinizer

## Shell script count lines and sum numbers from multiple files

I want to count the number of lines (I need the result to be a number) and sum the last numeric column. I have managed to do this one step at a time, but I need to run it from crontab, so it has to be a script. Here are my lines: It counts the number of lines: egrep -i String file_name_201611* |...

## awk to sum column field from duplicate row/lines

Hello, I am new to the Linux environment. I am working on a Linux script which should send an automatic email based on a specific condition from a log file. Below is the sample log file Name m/c usage abc xxx 10 abc xxx 20 abc xxx 5 xyz ...

## Summing over specific lines and replacing the lines with the sum using sed, awk

Hi friends, This is sed & awk type question. I have a text file which has numbers spread all over the file. I want to sum the series of numbers whenever i find it and produce an output file with the sum. For example ###start of input text file #### abc def ghi 1 2 3 4 kjld random...

## AWK script - extracting min and max values from selected lines

Hi guys! I'm new to scripting and I need to write a script in awk. Here is example of file on which I'm working ATOM 4688 HG1 PRO A 322 18.080 59.680 137.020 1.00 0.00 ATOM 4689 HG2 PRO A 322 18.850 61.220 137.010 1.00 0.00 ATOM 4690 CD ...

## awk script for getting the selected records from a file.

Hello, I have attached one file named file.txt . I have to create a file using the awk script with the records in which 38th position is P and not V . ex it should have 00501 HOLTSVILLE NYP00501 and it should not include 00501 I R S SERVICE CENTER ...

## trying to print selected fields of selected lines by AWK

I am trying to print the 1st, 2nd, 13th and 14th fields of lines 29 to 10029 of a file. I don't know how to put this in one command. Currently I am extracting the selected lines with awk 'NR==29,NR==10029' File1 > File2 and then doing awk '{print $1, $2, $13, $14}' File2 > File3 Can...

## Perl script to find particular field and sum it

Hi, I have a file with format a b c d e 1 1 2 2 2 1 2 2 2 3 1 1 1 1 2 1 1 1 1 4 1 1 1 1 6 in column e i want to find all similar fields ( with perl script )and sum it how many are there for instance in format above. 2 - 2 times 4 - 1 time 6 - 1 time what i use is ...

## shell script(Preferably awk or sed) to print selected number of columns from each row

Hi Experts, The question may look very silly by seeing the title, but please have a look at it clearly. I have a text file where the first 5 columns in each row were supposed to be attributes of a sample(like sample name, number, status etc) and the next 25 columns are parameters on which...

## Sum of all lines in file without roundup with awk

Hi, I have a file and I want to sum all the numbers in it. Example of the file: 0.6714359 -3842.59553830551 I used your forum (https://www.unix.com/shell-programming-scripting/74293-how-get-sum-all-lines-file.html) and found a script that worked for me: awk '{a+=$0}END{print a}'...

## extracting selected few lines through perl

How can I extract few lines(like 10 to 15, top 10 and last 10) from a file using perl. I do it with sed, head and tail in unix scripting. I am new to perl. Appreciate your help.