## Script to sum values from selected lines (awk, perl)

# 1
10-10-2009
Hello.
I'm facing a two-part problem.

I have a file with lines structured like this:

...........
12345678 4
12345989 13
12346356 205
12346644 74
12346819 22
.........

The first field (a timestamp) is non-decreasing: each value is greater than or equal to the one before.

1) Sum the second fields of all lines whose first_field/500 (integer division) is equal.
2) Sum the second fields of lines whose first fields differ by less than 500 (sliding window).

In the example above:

1) Because 12345678/500 and 12345989/500 both give 24691, sum=4+13.
The 3rd line cannot be grouped with any other, so sum=205.
The 4th and 5th lines are grouped, so sum=74+22.

2) We group the 1st and 2nd lines because 12345989 - 12345678 < 500.
By analogy we group the 2nd and 3rd, the 3rd and 4th,
and the 3rd, 4th and 5th because 12346819 (5th line) - 12346356 (3rd line) < 500.

Is there a way to do this (in perl, awk, etc.)?

Thanks

Paolo
# 2
10-10-2009
In Perl it is very simple to do.

But before I do the work for you, I would like to know what you have tried so far.

You should try it yourself and ask for clarification or advice if you run into difficulties, which is always a good way to learn.
# 3
10-10-2009
I know a little awk and some basics of perl.

awk '{if ($1/500 > last_time_frame) { sum = $2 } else { sum+=$2;print sum };last_time_frame=$1/500;print sum}' AAAA.txt

No luck.
# 4
10-10-2009
Sorry, but the problem is not clear enough.

Quote:
Originally Posted by paolfili
...
2)Sum the second fields if the difference between first fields is less than 500.
(sliding window)
What's the length of the sliding window ?

- Is it just 2 (1st & 2nd, 2nd & 3rd, 3rd & 4th, ...) ?
- Or is it 3 (1st, 2nd & 3rd; 2nd, 3rd & 4th; ...) ?

Hopefully, it's not a cartesian product, i.e.

1st vs. (2nd, 3rd, 4th, ... , last_row)
2nd vs. (1st, 3rd, 4th, ... , last_row)
3rd vs. (1st, 2nd, 4th, ... , last_row)
...
last_row vs. (1st, 2nd, 3rd, ..., last-1_row)

Quote:
...
1) Because 12345678/500 and 12345989/500 both give 24691, sum=4+13.
The 3rd line cannot be grouped with any other, so sum=205.
The 4th and 5th lines are grouped, so sum=74+22.
- Ok, and what do you want to do with the sum ?
- Do you want to display it ? Or do nothing with it (highly unlikely) ?
- If you want to display it, then how ? The total against each row ? Or the total against the first row only ? Or against the second row only ?

Quote:
...
and the 3rd, 4th and 5th because 12346819 (5th line) - 12346356 (3rd line) < 500.
This begs the first counter-question. Why compare the 3rd, 4th and 5th (considering that you have been comparing two-at-a-time all this while) ?
So again, what's the length of the sliding window ?

I guess a very simple example of your input file should help here. So, let's say your input file is as follows:

What do you want your output to look like ?

tyler_durden
# 5
10-10-2009
???

Quote:
Originally Posted by durden_tyler
Sorry, but the problem is not clear enough.

What's the length of the sliding window ?

- Is it just 2 (1st & 2nd, 2nd & 3rd, 3rd & 4th, ...) ?
- Or is it 3 (1st, 2nd & 3rd; 2nd, 3rd & 4th; ...) ?

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The length is whatever the data requires: 1, 1000, or 1000000 data samples.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Hopefully, it's not a cartesian product, i.e.

1st vs. (2nd, 3rd, 4th, ... , last_row)
2nd vs. (1st, 3rd, 4th, ... , last_row)
3rd vs. (1st, 2nd, 4th, ... , last_row)
...
last_row vs. (1st, 2nd, 3rd, ..., last-1_row)

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
No cartesian product
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

- Ok, and what do you want to do with the sum ?
- Do you want to display it ? Or do nothing with it (highly unlikely) ?
- If you want to display it, then how ? The total against each row ? Or the total against the first row only ? Or against the second row only ?

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Only print the value.
The rest of the work I'll handle on my own. ;-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

This begs the first counter-question. Why compare the 3rd, 4th and 5th (considering that you have been comparing two-at-a-time all this while) ?
So again, what's the length of the sliding window ?

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Time frame concept.
Events in a time frame (e.g. 500 microseconds).
I need to sum events in a:
1) STATIC time frame environment.
2) DYNAMIC time frame environment (what the Digital Signal Processing field calls a sliding window).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

I guess a very simple example of your input file should help here. So, let's say your input file is as follows:

What do you want your output to look like ?

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Case 1)
sum=1+2+3+4
sum=5+6+7

Case 2)
sum=1+2+3+4
sum=2+3+4+5+6
sum=3+4+5+6+7
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

tyler_durden
>>>>>>>>>>>>>>

Paolo
# 6
10-11-2009
Thanks for the clarification.

# 7
10-11-2009
Using awk:

Case1:
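A minimal sketch for case 1 (static time frame), not necessarily the code originally posted here. It assumes whitespace-separated `timestamp value` lines sorted by timestamp, reads them from standard input, and prints one sum per frame. The function name `static_sum` and its optional frame-width argument (default 500) are made up for illustration.

```sh
# static_sum [width]: hypothetical helper, sums field 2 per static time frame.
static_sum() {
  awk -v w="${1:-500}" '
    {
      frame = int($1 / w)              # index of the static frame this line falls in
      if (NR > 1 && frame != prev) {   # a new frame starts: flush the previous sum
        print sum
        sum = 0
      }
      sum += $2
      prev = frame
    }
    END { if (NR > 0) print sum }      # flush the last frame
  '
}
```

Usage: `static_sum < AAAA.txt`. For the five sample lines in the first post this prints 17, 205 and 96, one per line. Note the `int()`: awk division is floating point, so comparing `$1/500` directly never finds two lines in the same frame unless their timestamps are equal.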
Case2:
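A minimal sketch for case 2 (sliding window), again not necessarily the code originally posted here. It assumes the same sorted `timestamp value` input on standard input. It also assumes that one sum is wanted per maximal window: a window starting at row i covers every row whose timestamp is less than 500 above row i's, and it is printed only if it reaches a later row than the previously printed window. That reading matches the "2 case" output requested above, but it is an interpretation. The function name `sliding_sum` and the width argument are made up for illustration.

```sh
# sliding_sum [width]: hypothetical helper, sums field 2 per maximal sliding window.
sliding_sum() {
  awk -v w="${1:-500}" '
    { t[NR] = $1; v[NR] = $2 }         # keep all rows; windows overlap
    END {
      j = 1                            # first row not yet inside the current window
      sum = 0                          # sum of rows i .. j-1
      last = 0                         # last row covered by the previously printed window
      for (i = 1; i <= NR; i++) {
        while (j <= NR && t[j] - t[i] < w) {   # extend the window to the right
          sum += v[j]
          j++
        }
        if (j - 1 > last) {            # window reaches further than the one before
          print sum
          last = j - 1
        }
        sum -= v[i]                    # drop row i before the start moves on
      }
    }
  '
}
```

Usage: `sliding_sum < AAAA.txt`. For the five sample lines in the first post this prints 17 (rows 1-2), 218 (rows 2-3) and 301 (rows 3-5). Both indices only move forward, so the pass over the stored rows is linear in the number of lines.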
Case1+2 combined:
Original testset:
