Hourly averaging using Awk


 
# 1  
Old 07-28-2011

Hey all,

I have a set of 5-second data as shown below. I need to find an hourly average of this data.

Code:
date                         co2
25/06/2011 08:04:00        8.30
25/06/2011 08:04:05        8.31
25/06/2011 08:04:10        8.32
25/06/2011 08:04:15        8.33
25/06/2011 08:04:20        8.32
25/06/2011 08:04:25        8.31
25/06/2011 08:04:30        8.30
25/06/2011 08:04:35        8.29
25/06/2011 08:04:40        8.34
25/06/2011 08:04:45        8.43
25/06/2011 08:04:50        8.33
25/06/2011 08:04:55        8.32
25/06/2011 08:05:00        8.31
25/06/2011 08:05:05        8.32
25/06/2011 08:05:10        8.30
25/06/2011 08:05:15        8.29
25/06/2011 08:05:20        8.30
25/06/2011 08:05:25        8.31
25/06/2011 08:05:30        8.33
25/06/2011 08:05:35        8.32
25/06/2011 08:05:40        8.32
25/06/2011 08:05:45        8.35
25/06/2011 08:05:50        8.34
25/06/2011 08:05:55        8.36
25/06/2011 08:06:00        8.37
25/06/2011 08:06:05        8.34
25/06/2011 08:06:10        8.35
25/06/2011 08:06:15        8.35
25/06/2011 08:06:20        8.34
25/06/2011 08:06:25        8.38
25/06/2011 08:06:30        8.35
25/06/2011 08:06:35        8.36
25/06/2011 08:06:40        8.32
25/06/2011 08:06:45        8.31
25/06/2011 08:06:50        8.30
25/06/2011 08:06:55        8.32

The above is just 3 minutes' worth of data! Is there a way to average the co2 values into hourly data?

The co2 data is in column 13 of my dataframe. I also have data in column 15 that I need to average hourly too!

Thanks a lot!

Last edited by gd9629; 07-29-2011 at 06:06 AM..
# 2  
Old 07-28-2011
How about this?
Code:
 
awk '{a[substr($2,1,index($2,":")-1)]=a[substr($2,1,index($2,":")-1)]+$3} END{ for(i in a)print "hour->",i,"average->",a[i]/720}' input_file

# 3  
Old 07-28-2011
Hey! This is my output:

Code:
hour-> 17 average-> 0
hour-> 08 average-> 0
hour->  average-> 0
hour-> 18 average-> 0
hour-> 09 average-> 0
hour-> 19 average-> 0
hour-> 00 average-> 0
hour-> 01 average-> 0
hour-> 10 average-> 0
hour-> 11 average-> 0
hour-> 20 average-> 0
hour-> 02 average-> 0
hour-> 12 average-> 0
hour-> 21 average-> 0
hour-> 03 average-> 0
hour-> 04 average-> 0
hour-> 22 average-> 0
hour-> 13 average-> 0
hour-> 23 average-> 0
hour-> 05 average-> 0
hour-> 14 average-> 0
hour-> 06 average-> 0
hour-> 15 average-> 0
hour-> 16 average-> 0
hour-> 07 average-> 0

Also, I need to edit the actual file: the 5-second data needs to be deleted and replaced with the hourly averages, i.e.

Code:
date                         co2
25/06/2011 08:00:00        8.30
25/06/2011 09:00:00        8.31
25/06/2011 10:00:00        8.32
25/06/2011 11:00:00        8.33

etc.
# 4  
Old 07-28-2011
Hello Not sure,

the same code I posted worked for me!!!

Try this:

Code:
 
awk '{a[substr($2,1,index($2,":")-1)]=a[substr($2,1,index($2,":")-1)]+$3} END{ for(i in a)print "hour->",i,"average->",a[i]/720.0}' input_file

I tested it on the sample that I have, which is too big to paste here, and got a satisfactory result.

Code:
 
awk '{a[$1" "substr($2,1,index($2,":")-1)]=a[$1" "substr($2,1,index($2,":")-1)]+$3} END{ for(i in a)print i":00:00",a[i]/720.0}' input_file
01/01/2011 00:00:00 8.17293
01/01/2011 01:00:00 8.23
01/01/2011 02:00:00 8.31


Last edited by panyam; 07-28-2011 at 12:00 PM.. Reason: Added output that I got from sample
# 5  
Old 07-28-2011
Code:
awk '{split($2,a,":");b[$1 FS a[1]]+=$3;c[$1 FS a[1]]++}
    END{for (i in b) printf "%s:00:00\t%.2f\n", i,b[i]/c[i]}' infile

# 6  
Old 07-29-2011
@panyam this is the result the shell gave me:

Code:
09/06/2011 20:00:00 0
24/06/2011 13:00:00 0
21/06/2011 18:00:00 0
09/06/2011 21:00:00 0
24/06/2011 14:00:00 0
21/06/2011 19:00:00 0
09/06/2011 22:00:00 0
24/06/2011 15:00:00 0

(I know the dates are mixed up, I need to sort this out after)

@rdcwayx this is the result I got from your code

Code:
09/06/2011 20:00:00     0.00
24/06/2011 13:00:00     0.00
21/06/2011 18:00:00     0.00
09/06/2011 21:00:00     0.00
24/06/2011 14:00:00     0.00
21/06/2011 19:00:00     0.00
09/06/2011 22:00:00     0.00

Also the code has to actually edit the csv file.
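
A guess about the zeros, since the real file isn't shown here: the one-liners above split on whitespace, so on a comma-separated file `$3` is not the co2 value. Below is a minimal sketch for the CSV case. It assumes the first field looks like `25/06/2011 08:04:00`, that line 1 is a header, and that fields 13 and 15 are co2 and ch4 (going by the header list in the script further down). The function name `hourly_avg` and the temp-file name are only illustrative.

```shell
#!/bin/sh
# Sketch: hourly means of CSV fields 13 (co2) and 15 (ch4).
# Assumes field 1 is "dd/mm/yyyy hh:mm:ss" and line 1 is a header.
hourly_avg() {
    awk -F, '
        NR == 1 { print "date,co2,ch4"; next }   # replace the header
        {
            key = substr($1, 1, 13)              # "dd/mm/yyyy hh"
            co2[key] += $13; ch4[key] += $15; n[key]++
        }
        END {
            # divide by the number of rows seen, not a fixed 720
            for (k in n)
                printf "%s:00:00,%.2f,%.2f\n", k, co2[k] / n[k], ch4[k] / n[k]
        }
    ' "$1"
}

# Replace the file only if awk succeeded (hypothetical usage):
# hourly_avg juneout2.csv > juneout2.csv.tmp && mv juneout2.csv.tmp juneout2.csv
```

awk cannot safely write to the file it is still reading, which is why the sketch goes through a temporary file before `mv`. The hourly rows come out in no particular order.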

Thanks guys

---------- Post updated at 10:33 AM ---------- Previous update was at 09:28 AM ----------

Do you think it makes sense to sort out the dates/times before averaging the data? It does, doesn't it? Here's my code so far:

Code:
#!/bin/bash


# input every 'CF*nc.dat' file in 'picarro' folder and create combined 'june2011.dat' file
find /u/gd9629/private/Scripts/Gavin/picarro -type f -name "CF*nc.dat" -exec cat {} + > /u/gd9629/private/Scripts/Gavin/Data/june2011.dat

# use 'june2011.dat' as input file to create .csv file
IN_all='/u/gd9629/private/Scripts/Gavin/Data/june2011.dat' 
	
# Output files
OUT_all='/u/gd9629/private/Scripts/Gavin/Awk/juneout.csv'		# the csv file to create for all data ('juneout.csv')

# gawk files to create csv file
GAWK='/u/gd9629/private/Scripts/Gavin/Format.csv.awk'

# creating headers for the columns
echo "date,alarm,species,solenoid,mpv,outlet,cavp,cavt,warmbox,etalon,dastemp,co2sync,co2,ch4sync,ch4,h2osync" > "$OUT_all"    # create clean OUT_all file with headers

# produce the OUT file from the IN file(s)
"$GAWK" "$IN_all" >> "$OUT_all"


#removes the interspersed headers
awk '!/^\/\//' /u/gd9629/private/Scripts/Gavin/Awk/juneout.csv > /u/gd9629/private/Scripts/Gavin/Awk/juneout2.csv

The problem is that the .dat files in the folder I've specified are read in no discernible order (or so it seems). Is there a way of telling awk to sort the data by day and time?
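
On the read order: `find` lists files in whatever order the directory returns them, so the combined file inherits that order. A minimal sketch that sorts the names before concatenating is below. It assumes the `CF*nc.dat` names sort chronologically (for example because they embed a date) and contain no newlines. `concat_sorted` is a made-up helper name.

```shell
#!/bin/sh
# Sketch: concatenate matching files in name order.
# Assumes the file names sort chronologically and contain no newlines.
concat_sorted() {
    # $1 = directory to search, $2 = file name pattern
    find "$1" -type f -name "$2" | sort | while IFS= read -r f; do
        cat "$f"
    done
}

# Hypothetical usage:
# concat_sorted /path/to/picarro 'CF*nc.dat' > june2011.dat
```

The sort is on the full path, so files in different subdirectories are grouped by directory first.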

I suppose this would make averaging the data easier, as opposed to sorting it out afterwards.
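
Sorting the input first should not be needed for the averaging itself: the awk arrays group rows by their date-and-hour key whatever order the rows arrive in. The shuffled output comes from `for (i in ...)`, whose order is unspecified. A plain `sort` will not order `dd/mm/yyyy` dates chronologically, so here is a sketch that sorts the hourly lines afterwards instead. It assumes each line starts with `dd/mm/yyyy hh:00:00` and that there is no header line. `sort_hourly` is an illustrative name.

```shell
#!/bin/sh
# Sketch: put hourly result lines into chronological order.
# Assumes every input line starts with "dd/mm/yyyy hh:00:00" (no header).
sort_hourly() {
    awk '{
        split($1, d, "/")                       # d[1]=dd d[2]=mm d[3]=yyyy
        # prefix a sortable key: yyyymmddhh, then a tab, then the line
        print d[3] d[2] d[1] substr($2, 1, 2) "\t" $0
    }' | sort -k1,1 | cut -f2-                  # sort on the key, then drop it
}

# Hypothetical usage:
# awk '...hourly averaging...' infile | sort_hourly
```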