Averaging help in awk


 
# 8  
Old 08-17-2011
Put this code into "script.awk":
Code:
NR==1{
  gsub(" +","\t")
  print
}
NR>1&&(NR-1)%360{
  for (i=3;i<=NF;i++){
    a[i]+=$i
  }
}
NR>1&&!((NR-2)%360){
  t=$1"\t"$2"\t"
}
NR>1&&!((NR-1)%360){
  printf t
  for (i=3;i<=NF;i++){
    printf "%.10e\t",a[i]/360
    a[i]=0
  }
  printf "\n"
}

Then run it like this:
Code:
awk -f script.awk input > output

This will change the number format of some columns, and it will replace the field delimiter with a single TAB, as keeping all the original spacing intact is quite troublesome.
This User Gave Thanks to bartus11 For This Post:
# 9  
Old 08-26-2011
Hey Bartus,

For the averaging code you gave me, I'm unsure what it actually does. Can you briefly run through what the parts of the script do?


Code:
#!/bin/gawk -f

# this script is used to average the data into 60 minute intervals


NR==1{
  gsub(" +","\t")
  print
}
NR>1&&(NR-1)%720{
  for (i=3;i<=NF;i++){
    a[i]+=$i
  }
}
NR>1&&!((NR-2)%720){
  t=$1"\t"$2"\t"
}
NR>1&&!((NR-1)%720){
  printf t
  for (i=3;i<=NF;i++){
    printf "%.5e\t",a[i]/720
    a[i]=0
  }
  printf "\n"
}

# 10  
Old 08-26-2011
When analyzing my code I noticed that it was not giving accurate output: the last row of each block was never added to the sums. This should calculate the average properly:
Code:
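# Line 1: print the header with runs of spaces collapsed to single TABs
NR==1{
  gsub(" +","\t")
  print
}
# All rows of a 720-row block except the last: add each data column to its sum
NR>1&&(NR-1)%720{
  for (i=3;i<=NF;i++){
    a[i]+=$i
  }
}
# First row of each block: remember its first two fields (the timestamp)
NR>1&&!((NR-2)%720){
  t=$1"\t"$2"\t"
}
# Last row of each block: add it too, then print the timestamp and the averages
NR>1&&!((NR-1)%720){
  for (i=3;i<=NF;i++){
    a[i]+=$i
  }
  printf "%s",t   # "%s" keeps a stray % in the data from being read as a format
  for (i=3;i<=NF;i++){
    printf "%.5e\t",a[i]/720
    a[i]=0
  }
  printf "\n"
}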

In simple words, it goes through the whole file and checks the line numbers:
for line 1 it prints the header;
for line 2 it saves the time into a variable;
for lines 2-720 it sums the columns into the corresponding array elements;
for line 721 it adds the last values of the range to the array elements, then prints the time saved at line 2 followed by the calculated averages.
Using the modulo operator causes these steps to be repeated for lines 722-1440, 1441, etc.
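To watch the line-number logic on something small, here is a sketch of the same four rules with the block size passed in as a variable. The variable `n`, the wrapper function `avg_blocks` and the six sample rows are all made up for this demo and are not part of the script above.

```shell
# Same rules as the corrected script, but the block size is the awk
# variable n (an addition for this demo) instead of the hard-coded 720.
avg_blocks() {
  awk -v n="$1" '
    NR==1 { gsub(" +","\t"); print }              # header
    NR>1 && (NR-1)%n {                            # all rows of a block but the last
      for (i=3;i<=NF;i++) a[i]+=$i
    }
    NR>1 && !((NR-2)%n) { t=$1"\t"$2"\t" }        # first row of a block: save time
    NR>1 && !((NR-1)%n) {                         # last row: add it, print averages
      for (i=3;i<=NF;i++) a[i]+=$i
      printf "%s", t
      for (i=3;i<=NF;i++) { printf "%.5e\t", a[i]/n; a[i]=0 }
      printf "\n"
    }'
}

# Made-up sample: one header line and six data rows, averaged in blocks of 3
printf '%s\n' \
  'date time co2' \
  '25/06/2011 08:04:00 1' \
  '25/06/2011 08:04:05 2' \
  '25/06/2011 08:04:10 3' \
  '25/06/2011 08:04:15 4' \
  '25/06/2011 08:04:20 5' \
  '25/06/2011 08:04:25 6' | avg_blocks 3
```

With a block size of 3, lines 2-4 form the first block and lines 5-7 the second, so the two output rows carry the times 08:04:00 and 08:04:15 and the averages 2 and 5.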
# 11  
Old 08-31-2011
Hi gd9629
I recognise the data you're working with (measurements of atmospheric carbon dioxide, methane and water vapour) - my day job is running instruments such as you have.

A couple of comments, if I may:
It would be fairly standard procedure to average such data to a fixed timestamp rather than have it defined by the start of the data record, e.g. hourly averages from minute 0 to minute 59. This simplifies comparison or combination of different data sets and takes care of pesky issues such as periods of missing data.
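A minimal sketch of averaging by clock hour instead of by row count might look like the following. It rests on assumptions that this thread does not confirm: that `$1` is the date, `$2` is the time as HH:MM:SS, the data columns start at `$3`, line 1 is a header, and the rows are in time order. The names `hourly_avg` and `emit` are made up for illustration.

```shell
# Sketch: one output row per clock hour, however many input rows fall in it.
# Assumed layout (hypothetical): date in $1, HH:MM:SS in $2, data from $3 on.
hourly_avg() {
  awk '
    function emit(   i) {                    # print averages for the finished hour
      if (!cnt) return
      printf "%s", key
      for (i=3;i<=nf;i++) { printf "\t%.5e", sum[i]/cnt; sum[i]=0 }
      printf "\n"
      cnt=0
    }
    NR==1 { gsub(" +","\t"); print; next }   # header
    {
      k=$1 "\t" substr($2,1,2) ":00:00"      # bin label: date plus hour
      if (k!=key) { emit(); key=k }          # hour changed: flush the previous bin
      for (i=3;i<=NF;i++) sum[i]+=$i
      nf=NF; cnt++                           # cnt = rows actually present this hour
    }
    END { emit() }'
}

# Made-up sample: two rows just before 09:00 and one row at 09:00
printf '%s\n' \
  'date time co2' \
  '25/06/2011 08:59:50 1' \
  '25/06/2011 08:59:55 3' \
  '25/06/2011 09:00:00 10' | hourly_avg
```

Because each sum is divided by the number of rows actually seen in that hour, a gap in the record shortens one bin instead of shifting every later one. Using `substr($2,1,5) ":00"` as the label would give one-minute bins under the same assumptions.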

You can reduce your computation time by ignoring the data which are either pointless or meaningless to average. PM me if you want to discuss further, as that discussion would be very much off-topic.

My own approach would be to reduce the data to one-minute averages, filtered against the diagnostics information in $6 to $15 (again, PM me if you want further clarification), and written to a daily or monthly file. You can then calculate averages and other stats for whatever periods you wish without having to revisit the rather large raw data files you started with (must be at least 350Mb per day?). You can also reformat to whatever suits you. I'd go with space or comma delimited and lose the exponents and a lot of excess digits.

cheers