Sum value from selected lines script (awk,perl) | Unix Linux Forums | Shell Programming and Scripting

#1  10-10-2009  paolfili

Hello.
I am facing a two-part problem.

I have lines with this structure:

...........
12345678 4
12345989 13
12346356 205
12346644 74
12346819 22
.........


The first field (a timestamp) never decreases: each value is greater than or equal to the previous one.

1) Sum the second fields of the lines whose first_field/500 (integer part) is the same.
2) Sum the second fields of the lines whose first fields differ by less than 500
(sliding window).

In the example presented.

1) Because the integer parts of 12345678/500 and 12345989/500 are both 24691, sum=4+13.
The 3rd line cannot be grouped with any other (it gives 24692), so sum=205.
The 4th and 5th lines are grouped (both give 24693), so sum=74+22.

2) We group the 1st and 2nd lines because 12345989 - 12345678 < 500.
By analogy we group the 2nd and 3rd, the 3rd and 4th,
and the 3rd, 4th and 5th because 12346819 (5th line) - 12346356 (3rd line) < 500.


Is there any way (perl, awk, etc.) to do this?

Thanks

Paolo
#2  10-10-2009  thegeek
In Perl this is very simple to do.

But before I do the work for you, I would like to know what you have tried so far.

You should try it yourself first and ask for clarification or advice if you run into difficulties; that is always a good way to learn.
#3  10-10-2009  paolfili
I know a little awk and some elements of Perl. This is what I tried:

Code:
awk '{if ($1/500 > last_time_frame) { sum = $2 } else { sum+=$2;print sum };last_time_frame=$1/500;print sum}' AAAA.txt

It does not work.
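
A side note on the attempt above: awk's / is floating-point division, so $1/500 is different for almost every line and the comparison never sees two lines as belonging to the same time frame. Below is a minimal sketch of the fixed-frame sum that truncates with int() first. It assumes the input is sorted by timestamp; bucket_sums is a hypothetical helper name, and the frame width (500 by default) is an optional argument added for illustration.

```shell
# Sketch: print one sum per fixed-width time frame, in input order.
# bucket_sums is a hypothetical name; $1 is the frame width (default 500).
bucket_sums() {
  awk -v w="${1:-500}" '
    {
      b = int($1 / w)                # truncate: awk division is floating point
      if (NR > 1 && b != prev) {     # frame changed: flush the finished sum
        print "Sum=" sum
        sum = 0
      }
      sum += $2
      prev = b
    }
    END { if (NR > 0) print "Sum=" sum }   # flush the last frame
  '
}

# The five sample lines from post #1
printf '%s\n' '12345678 4' '12345989 13' '12346356 205' \
              '12346644 74' '12346819 22' | bucket_sums
```

On the five sample lines this prints Sum=17, Sum=205 and Sum=96, which matches the grouping described in the first post.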
#4  10-10-2009  durden_tyler
Sorry, but the problem is not clear enough.

Quote:
Originally Posted by paolfili
...
2)Sum the second fields if the difference between first fields is less than 500.
(sliding window)
What's the length of the sliding window ?

- Is it just 2 (1st & 2nd, 2nd & 3rd, 3rd & 4th, ...) ?
- Or is it 3 (1st, 2nd & 3rd; 2nd, 3rd & 4th; ...) ?

Hopefully, it's not a cartesian product, i.e.

1st vs. (2nd, 3rd, 4th, ... , last_row)
2nd vs. (1st, 3rd, 4th, ... , last_row)
3rd vs. (1st, 2nd, 4th, ... , last_row)
...
last_row vs. (1st, 2nd, 3rd, ..., last-1_row)

Quote:
...
1) Because 12345678/500 and 12345989/500 both result in 24691, sum=4+13
We cannot group the 3rd line so sum=205
And we group the 4th and 5th line so sum=74+22
- Ok, and what do you want to do with the sum ?
- Do you want to display it ? Or do nothing with it (highly unlikely) ?
- If you want to display it, then how ? The total against each row ? Or the total against the first row only ? Or against the second row only ?

Quote:
...
and the 3rd, 4th and 5th because 12346819 (of the 5th line) - 12346356 (of the 3rd line) < 500
This brings me back to my first question: why compare the 3rd, 4th and 5th, considering that you have been comparing two at a time until now ?
So again, what's the length of the sliding window ?

I guess a very simple example of your input file should help here. So, let's say your input file is as follows:


Code:
$
$ cat f1
100 1
200 2
300 3
400 4
500 5
600 6
700 7
$

What do you want your output to look like ?

tyler_durden
#5  10-10-2009  paolfili
???

Quote:
Originally Posted by durden_tyler
What's the length of the sliding window ?
- Is it just 2 (1st & 2nd, 2nd & 3rd, 3rd & 4th, ...) ?
- Or is it 3 (1st, 2nd & 3rd; 2nd, 3rd & 4th; ...) ?

The length is whatever the data requires: 1, 1000 or 1000000 data samples.

Quote:
Hopefully, it's not a cartesian product, i.e.
1st vs. (2nd, 3rd, 4th, ... , last_row)
2nd vs. (1st, 3rd, 4th, ... , last_row)
3rd vs. (1st, 2nd, 4th, ... , last_row)
...
last_row vs. (1st, 2nd, 3rd, ..., last-1_row)

No cartesian product.

Quote:
- Ok, and what do you want to do with the sum ?
- Do you want to display it ? Or do nothing with it (highly unlikely) ?
- If you want to display it, then how ? The total against each row ? Or the total against the first row only ? Or against the second row only ?

Only print the value.
For the rest of the work I'm on my own. ;-)

Quote:
This begs the first counter-question. Why compare the 3rd, 4th and 5th (considering that you have been comparing two-at-a-time all this while) ?
So again, what's the length of the sliding window ?

It is a time frame concept: events in a time frame (e.g. 500 microseconds).
I need to sum events in:
1) a STATIC time frame environment;
2) a DYNAMIC time frame environment (what the Digital Signal Processing field calls a sliding window).

Quote:
I guess a very simple example of your input file should help here. So, let's say your input file is as follows:

Code:
$
$ cat f1
100 1
200 2
300 3
400 4
500 5
600 6
700 7
$

What do you want your output to look like ?

Case 1)
sum=1+2+3+4
sum=5+6+7

Case 2)
sum=1+2+3+4
sum=2+3+4+5+6
sum=3+4+5+6+7

Paolo
#6  10-11-2009  durden_tyler
Thanks for the clarification.


Code:
$ 
$ cat f1
100 1   
200 2   
300 3   
400 4   
500 5   
600 6   
700 7   
$       
$ # Case 1
$ ##
$ perl -lane 'if (defined $prev && int($F[0]/500) != $prev) {print "Sum=$s"; $s = $F[1]}
>             else {$s += $F[1]}
>             $prev = int($F[0]/500);   # without the defined() test, a first bucket other than 0 prints an empty Sum=
>             END {print "Sum=$s"}' f1
Sum=10                                                                
Sum=18                                                                
$                                                                     
$                                                                     
$ # Case 2                                                            
$ ##
$ perl -lne 'chomp; push @x,$_;
>            END {
>              for($i=0; $i<=$#x; $i++){
>                ($x1,$x2) = split/ /,$x[$i];
>                $s = $x2;
>                for ($j=$i+1; $j<=$#x; $j++) {
>                  ($y1,$y2) = split/ /,$x[$j];
>                  if ($y1 - $x1 < 500) {$s += $y2}
>                  else {last}
>                }
>                print "Sum=$s";
>              }
>            }' f1
Sum=15
Sum=20
Sum=25
Sum=22
Sum=18
Sum=13
Sum=7
$
$
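
For comparison, the forward-looking window of Case 2 above can also be sketched in awk. This is only an illustration, assuming the input is sorted by the first field and fits in memory; forward_window_sums is a made-up name and the window width argument (default 500) is an addition, neither comes from the posts.

```shell
# Sketch: for every line i, sum the values of lines i..j
# for as long as time[j] - time[i] stays below the window width.
# forward_window_sums is a hypothetical name; $1 is the width (default 500).
forward_window_sums() {
  awk -v w="${1:-500}" '
    { t[NR] = $1; v[NR] = $2 }       # buffer the whole input
    END {
      for (i = 1; i <= NR; i++) {
        s = 0
        for (j = i; j <= NR && t[j] - t[i] < w; j++)
          s += v[j]                  # line j is still inside the window of line i
        print "Sum=" s
      }
    }
  '
}

# Same data as the f1 test file
printf '%s\n' '100 1' '200 2' '300 3' '400 4' '500 5' '600 6' '700 7' |
  forward_window_sums
```

On the f1 data this gives the same seven sums as the Perl version (15, 20, 25, 22, 18, 13, 7). The inner loop stops at the first line outside the window, which is only valid because the timestamps never decrease.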

#7  10-11-2009  Scrutinizer
Using awk:

Case1:

Code:
awk '{ k=int($1/500); if (!(k in sum1)) key[++n]=k; sum1[k]+=$2 }
     END { for (j=1;j<=n;j++) print "Sum1 "sum1[key[j]] }' infile

Case2:

Code:
awk 'BEGIN{
       min=1
     }
     { time[NR]=$1
       val[NR]=sum2[NR]=$2
       i=min
       while (time[NR]-time[i]>=500)
         i++
       min=i
       for (i=min;i<NR;i++)
         sum2[NR]+=val[i]
     }
     END {
       for (i=1;i<=NR;i++)
         print "Sum2: "sum2[i]
     }' infile

Case1+2 combined:

Code:
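awk 'BEGIN{
       min=1
     }
     { k=int($1/500)
       if (!(k in sum1)) key[++n]=k   # remember buckets in order of first appearance
       sum1[k]+=$2
       time[NR]=$1
       val[NR]=sum2[NR]=$2
       i=min
       while (time[NR]-time[i]>=500)
         i++
       min=i
       for (i=min;i<NR;i++)
         sum2[NR]+=val[i]
     }
     END {
       # "for (i in array)" has no defined order in awk, so loop by index instead
       for (j=1;j<=n;j++)
         print "Sum1 "sum1[key[j]]
       print ""
       for (i=1;i<=NR;i++)
         print "Sum2: "sum2[i]
     }' infile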

Original testset:

Code:
Sum1 17
Sum1 205
Sum1 96

Sum2: 4
Sum2: 17
Sum2: 218
Sum2: 279
Sum2: 301

Additional testset:

Code:
Sum1 10
Sum1 18

Sum2: 1
Sum2: 3
Sum2: 6
Sum2: 10
Sum2: 15
Sum2: 20
Sum2: 25
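
A possible variation on Case2 above, offered as a sketch rather than a replacement: instead of re-adding the whole window for every line, keep a running total and subtract the lines that fall out of the window. It assumes sorted input and a window width greater than zero; trailing_window_sums and the width argument are hypothetical names chosen here.

```shell
# Sketch: trailing sliding-window sum with a running total.
# Each line is added once and removed once, so the work is linear in the input.
# trailing_window_sums is a hypothetical name; $1 is the width (default 500).
trailing_window_sums() {
  awk -v w="${1:-500}" '
    BEGIN { lo = 1 }                 # index of the oldest line still in the window
    {
      t[NR] = $1; v[NR] = $2
      total += $2                    # the new line enters the window
      while ($1 - t[lo] >= w) {      # drop lines that are now too old
        total -= v[lo]
        delete t[lo]; delete v[lo]   # free memory for expired lines
        lo++
      }
      print "Sum2: " total
    }
  '
}

# Original test set from post #1
printf '%s\n' '12345678 4' '12345989 13' '12346356 205' \
              '12346644 74' '12346819 22' | trailing_window_sums
```

This prints 4, 17, 218, 279 and 301 for the original test set, the same values as listed above, and it prints each sum as soon as its line is read instead of waiting for the END block.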
