calculate the average of time series data using AWK

12-23-2008


Hi,

I have two time series (below) merged into one file.
t1 and t2 are in seconds.

I want to calculate the average of V1 for every second of t1, and count how many times "1" occurs in V2 within each second of t2.

Input File:

t1 V1 t2 V2
10.000000 4.387413 10.139355302 1
10.100000 4.397372 10.252770182 1
10.200000 4.406951 10.398060182 1
10.300000 3.940732 10.515105302 1
10.400000 4.044359 10.645365302 1
10.500000 4.139778 10.768800182 1
10.600000 4.222087 10.929725222 1
10.700000 4.299174 11.106285302 1
10.800000 2.941378 11.216505302 1
10.900000 3.081282 11.324910182 1
11.000000 3.219284 11.626115222 1
11.100000 3.354575 11.822715302 1
11.200000 3.486347 11.968005302 1
11.300000 3.613792 12.107075222 1
11.400000 3.730119 12.233535302 1
11.500000 3.846800 12.377615222 1
11.600000 3.956768 12.494055302 1
11.700000 4.059215 12.642540182 1
11.800000 4.153333 12.742740182 1
11.900000 4.234293 12.853565222 1
12.000000 4.309844 13.093440182 1
12.100000 2.107283 13.209275222 1
12.200000 2.234828 13.343940182 1
12.300000 2.371988 13.471005302 1
12.400000 2.511328 13.635125222 1
12.500000 2.652041 13.824900182 1
12.600000 2.793317 13.955160182 1
12.700000 2.934348 14.082225302 1
12.800000 3.067364 14.185620182 1
12.900000 3.205592 14.302665302 1
13.000000 4.130738 14.421090182 1
13.100000 3.929949 14.707265222 1
13.200000 2.160613 14.828715302 1
13.300000 2.296229 14.938935302 1
13.400000 2.434470 15.114285302 1
13.500000 2.574528 15.242730182 1
13.600000 3.865811 15.485025302 1
13.700000 4.273357 15.660375302 1
13.800000 4.357861 15.895845302 1
13.900000 4.371735 16.034310182 1
14.000000 4.377158 16.150145222 1
..............
..............

Desired Output:

t1 V1 V2
10.000000 3.986053 7
11.000000 3.765453 6
12.000000 2.818793 7
13.000000 3.439529 7
...............
...............

Can anyone show me AWK code to calculate this?

Thanks

nica


12-23-2008


nawk -f nica.awk myFile

nica.awk:

Code:

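# Skip the header row and any non-numeric filler lines such as "......"
$1 !~ /^[0-9]/ { next }
{
   v1S[int($1)] += $2   # sum of V1 per whole second of t1
   v1N[int($1)]++       # number of V1 samples in that second

   if ($NF == "1") v2N[int($3)]++   # count of V2 == 1 per whole second of t2
}
END {
  # for-in order is unspecified in awk; pipe the output through sort -n if needed
  for (i in v1S)
    printf("%.6f%s%.6f%s%d\n", i, OFS, v1S[i]/v1N[i], OFS, v2N[i])
}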

Last edited by vgersh99; 12-23-2008 at 09:33 AM..

vgersh99


12-24-2008


Hi vgersh99,

Thanks for the code....

nica


12-24-2008


Hi,

I have another request.

Here, V2 is a counter.
I want to match each V2 value with the V1 value at the same time (t1), taking the V2 value just before the counter resets to "1".
e.g. 1,2,3,1 ---> I want the value of V1 where V2 is "3" (just before the reset to "1")
or
e.g. 1,2,1,1,1 ---> I want the values of V1 where V2 is "2" and then at each of the three consecutive "1"s

Input File:

t1 V1 t2 V2
10.000000 4.387413 10.139355302 1
10.100000 4.397372 10.252770182 2
10.200000 4.406951 10.398060182 1
10.300000 3.940732 10.515105302 1
10.400000 4.044359 10.645365302 1
10.500000 4.139778 10.768800182 2
10.600000 4.222087 10.929725222 3
10.700000 4.299174 11.106285302 1
10.800000 2.941378 11.216505302 1
10.900000 3.081282 11.324910182 2
11.000000 3.219284 11.626115222 3
11.100000 3.354575 11.822715302 4
11.200000 3.486347 11.968005302 5
11.300000 3.613792 12.107075222 1
11.400000 3.730119 12.233535302 1
11.500000 3.846800 12.377615222 1
11.600000 3.956768 12.494055302 1
11.700000 4.059215 12.642540182 1
11.800000 4.153333 12.742740182 1
11.900000 4.234293 12.853565222 1
12.000000 4.309844 13.093440182 1
..............
..............

Desired Output:

t1 V1 V2
10.200000 4.406951 2
10.300000 3.940732 1
10.500000 4.139778 1
10.900000 3.081282 3
11.100000 3.354575 1
11.900000 4.234293 5
...............
...............

nica


12-25-2008


perl:

Code:

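#! /usr/bin/perl -w
# Group rows by the whole-second part of t2, then print the mean of V1 and the row count.
open FH,"<a.txt" or die "cannot open a.txt: $!";
while(<FH>){
	@arr=split(" ",$_);
	next unless defined $arr[2];    # skip blank and filler lines
	if($arr[2]=~m/^([0-9]+)\./){    # the header row has no digits here, so it is ignored
		$hash{$1}->{NUM}++;
		$hash{$1}->{SUM}+=$arr[1];
	}
}
close FH;
for $key (sort { $a <=> $b } keys %hash){    # numeric order of seconds
	printf("%s.000000 %.6f %s\n",$key,$hash{$key}->{SUM}/$hash{$key}->{NUM},$hash{$key}->{NUM});
}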

awk:

Code:

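awk '
# Skip the header and filler lines; only rows with a numeric t2 are counted
$3 !~ /^[0-9]+\./ { next }
{
	key=substr($3,1,index($3,".")-1)   # whole-second part of t2
	arr[key]++                         # number of rows in that second
	brr[key]+=$2                       # sum of V1 over those rows
}
END{
	for(i in arr)
		printf("%s.000000 %.6f %s\n",i,brr[i]/arr[i],arr[i])
}
' a.txt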

Last edited by summer_cherry; 12-25-2008 at 06:17 AM..

summer_cherry


01-02-2009


Thanks summer_cherry,

But your code does not do what I expected.

Since V2 is a counter, I just want to take the highest count of V2 before each reset.
For example:

1,1,1,2,3,1,2,1,1,1 = 1,1,3,2,1,1,1 (desired output for new V2)
1,2,3,4,5,6,7,1,1,1 = 7,1,1,1
1,2,3,4,1,1,1,2,1,2 = 4,1,1,2,2

Then, I want to collect the values of V1 and V2 (the highest count) where t1 and t2 match (e.g. 10.200000 = 10.252770182)

Input File: Desired Output:

t1 V1 t2 V2 t1 V1 V2
10.000000 4.387413 10.139355302 1
10.100000 4.397372 10.252770182 2 ------> 10.200000 4.406951 2
10.200000 4.406951 10.398060182 1 ------> 10.300000 3.940732 1
10.300000 3.940732 10.515105302 1 ------> 10.500000 4.139778 1
10.400000 4.044359 10.645365302 1
10.500000 4.139778 10.768800182 2
10.600000 4.222087 10.929725222 3 -------> 10.900000 3.081282 3
10.700000 4.299174 11.106285302 1 -------> 11.100000 3.354575 1
10.800000 2.941378 11.216505302 1
10.900000 3.081282 11.324910182 2
11.000000 3.219284 11.626115222 3
11.100000 3.354575 11.822715302 4
11.200000 3.486347 11.968005302 5 -------> 11.900000 4.234293 5
11.300000 3.613792 12.107075222 1 -------> ............
11.400000 3.730119 12.233535302 1 -------> ............
11.500000 3.846800 12.377615222 1 -------> ............
11.600000 3.956768 12.494055302 1 -------> ............
11.700000 4.059215 12.642540182 1 -------> ............
11.800000 4.153333 12.742740182 1 -------> ............
11.900000 4.234293 12.853565222 1 -------> ............
12.000000 4.309844 13.093440182 1 -------> ............ so on...
..............
..............

I hope this makes clear what I expect.

Thanks

nica
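
One possible way to approach this second request in awk, offered as a sketch under stated assumptions rather than a tested solution for the full data set. It assumes that a V2 value counts as the "highest count" when the next row resets V2 to 1 (or the input ends), which reproduces the three example sequences above, and that t2 is matched to t1 by truncating t2 to one decimal place (so 10.252770182 pairs with 10.200000). The shell function name peak_match and the tenths-of-a-second keys are illustrative choices, not something from the thread.

```shell
# Hypothetical helper: reads the 4-column data (t1 V1 t2 V2) from files or stdin.
peak_match() {
  awk '
    # Skip the header, blank lines and the "......" filler rows.
    $1 !~ /^[0-9]/ || NF < 4 { next }
    {
      n++
      k = int($1 * 10 + 0.5)     # t1 rounded to whole tenths of a second
      t1[k] = $1                 # keep the original t1 text for printing
      v1[k] = $2                 # V1 indexed by that t1 key
      slot[n] = int($3 * 10)     # t2 truncated to tenths of a second (assumption)
      cnt[n]  = $4 + 0           # V2 counter value, forced to a number
    }
    END {
      print "t1", "V1", "V2"
      for (i = 1; i <= n; i++) {
        # Keep a counter value only if the next row resets to 1 or the input ends.
        if (i < n && cnt[i + 1] != 1) continue
        k = slot[i]
        # Print only when a row with the matching t1 exists in the input.
        if (k in v1) print t1[k], v1[k], cnt[i]
      }
    }
  ' "$@"
}
```

Run it as peak_match myFile. On the 21 sample rows posted above it should print the six rows listed under Desired Output; peaks whose matching t1 lies beyond the end of the file are skipped. The whole file is held in memory because the t1 row that matches a peak can appear later than the row carrying the peak.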
