Idea Required for such findings


 
# 1  
Old 03-04-2009

Dear All,
I have a file containing lines like these:
2009-03-04-0001,144,144,0,0,0,0,0,0,0,50,50
2009-03-04-0002,194,194,0,0,0,0,0,0,0,40,40
.........
2009-03-04-0059,134,134,0,0,0,0,0,0,0,60,60
2009-03-04-0100,144,144,0,0,0,0,0,0,0,55,55
...........
2009-03-04-0159,244,244,0,0,0,0,0,0,0,75,75
.......
2009-03-04-2359,244,244,0,0,0,0,0,0,0,75,75

That is, each line represents one minute.

Now, if I need to find the sum of the last parameter for a particular hour, what would the method be? As a beginner this seems very hard to me, so I am seeking some ideas from the experts.

Thank you.
# 2  
Old 03-04-2009
First, sort the input by hour. Next, split each line into fields delimited by , (comma) and also - (dash), and get the hour number. If the hour is the same as the previous line's hour, add the last field to the sum. If not, print the current sum, reset the sum to the last field, and remember the new hour. After the last line, print the sum for the final hour as well.
Code:
sort -t - -k 4.1n,4.2 |
awk -F '[-,]' 'BEGIN { last_hour = -1 } {
 hour = int($4 / 100)                               # HHMM -> hour
 if (last_hour < 0 || last_hour == hour) sum += $NF # same hour: accumulate
 else { print last_hour, sum; sum = $NF }           # new hour: print and reset
 last_hour = hour
}
END { if (last_hour >= 0) print last_hour, sum }'   # print the final hour


Last edited by otheus; 03-04-2009 at 09:18 AM.. Reason: fmt
# 3  
Old 03-04-2009
Dear otheus,
Thanks for this nice approach. But I don't understand the sort option, and where is the input file specified?
# 4  
Old 03-04-2009
Just pipe the input file into the command with cat, or add the filename to the sort command.

The sort option sorts on the first two characters of the fourth field, where the fields are delimited by - (the -t option does this).
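
To make that concrete, here is a small sketch of both ways to feed the file in. The file name sample.txt and its three lines are made up for illustration. With -t - each line is split on dashes, and -k 4.1n,4.2 compares characters 1 to 2 of the fourth field (the hour) numerically.

```shell
# sample.txt is a hypothetical input file, deliberately out of order
printf '%s\n' \
  '2009-03-04-2359,244,244,0,0,0,0,0,0,0,75,75' \
  '2009-03-04-0100,144,144,0,0,0,0,0,0,0,55,55' \
  '2009-03-04-0001,144,144,0,0,0,0,0,0,0,50,50' > sample.txt

# Form 1: give the file name to sort directly
sort -t - -k 4.1n,4.2 sample.txt

# Form 2: cat the file into sort on standard input
cat sample.txt | sort -t - -k 4.1n,4.2
```

Either form prints the lines in hour order, and the awk part of the pipeline can follow after a | exactly as before.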
# 5  
Old 03-05-2009
Input file:
Code:
$ cat ffile
2009-03-04-0001,144,144,0,0,0,0,0,0,0,50,50
2009-03-04-0002,194,194,0,0,0,0,0,0,0,40,40
2009-03-04-0059,134,134,0,0,0,0,0,0,0,60,60
2009-03-04-0100,144,144,0,0,0,0,0,0,0,55,55
2009-03-04-0159,244,244,0,0,0,0,0,0,0,75,75
2009-03-04-2359,244,244,0,0,0,0,0,0,0,75,75

Quote:
Originally Posted by otheus
First, sort the input by hour. Next, split each line into fields delimited by , (comma) and also - (dash), and get the hour number. If the hour is the same as the previous line's hour, add the last field to the sum. If not, print the current sum, reset the sum to the last field, and remember the new hour.
Code:
sort -t - -k 4.1n,4.2 ffile |
awk -F '[-,]' 'BEGIN { last_hour=-1; } { 
 hour=int($4/100); 
 if (last_hour < 0 || last_hour == hour) sum+=$NF; 
 else { print last_hour,sum; sum=$NF;  } 
 last_hour=hour; 
}'

The above fails with the error below:

Code:
awk: syntax error near line 1
awk: bailing out near line 1

However, this works:
Code:
$ awk -F"," '{hr=substr($1,12,2); sum[hr]+=$NF}END{for(i in sum){print "hour", i, sum[i]}}' ffile |sort -n
hour 00 150
hour 01 130
hour 23 75
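
Since the original question asked about one particular hour, the same substr idea can be narrowed with a filter. This is only a sketch: the function name hour_sum, the variable want, and the file name ffile_demo are my own choices, not from the posts above, and it assumes an awk that accepts -v (nawk, gawk, or any POSIX awk), which a very old awk may not.

```shell
# ffile_demo is a hypothetical copy of the sample data shown above
cat > ffile_demo <<'EOF'
2009-03-04-0001,144,144,0,0,0,0,0,0,0,50,50
2009-03-04-0002,194,194,0,0,0,0,0,0,0,40,40
2009-03-04-0059,134,134,0,0,0,0,0,0,0,60,60
2009-03-04-0100,144,144,0,0,0,0,0,0,0,55,55
2009-03-04-0159,244,244,0,0,0,0,0,0,0,75,75
2009-03-04-2359,244,244,0,0,0,0,0,0,0,75,75
EOF

# hour_sum HH FILE: add up the last field of every line whose hour is HH
# (HH must be given as two digits, e.g. 01, because it is compared as text)
hour_sum() {
  awk -F, -v want="$1" '
    substr($1, 12, 2) == want { sum += $NF }   # characters 12-13 of field 1 hold the hour
    END { print "hour", want, sum + 0 }        # sum + 0 prints 0 when no line matched
  ' "$2"
}

hour_sum 01 ffile_demo
```

For the sample data this prints "hour 01 130", the same figure as the per-hour listing.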

# 6  
Old 03-05-2009
Dear rikxik,
I think there is a limitation in your code: the last parameter of a line may have 2 or 3 digits. If it has 3, what will the outcome be?
# 7  
Old 03-05-2009
It can have any number of digits for all I care; it should still work, because awk adds the values as numbers.

For example, I've added some lines with more than 2 digits in the last column:
Code:
$ cat ffile
2009-03-04-0001,144,144,0,0,0,0,0,0,0,50,50
2009-03-04-0002,194,194,0,0,0,0,0,0,0,40,40
2009-03-04-0059,134,134,0,0,0,0,0,0,0,60,60
2009-03-04-0100,144,144,0,0,0,0,0,0,0,55,55
2009-03-04-0159,244,244,0,0,0,0,0,0,0,75,75
2009-03-04-2359,244,244,0,0,0,0,0,0,0,75,75
2009-03-04-2350,244,244,0,0,0,0,0,0,0,75,750
2009-03-04-2351,244,244,0,0,0,0,0,0,0,75,7500
2009-03-04-2352,244,244,0,0,0,0,0,0,0,75,75000

Output:
Code:
$ awk -F"," '{hr=substr($1,12,2); sum[hr]+=$NF}END{for(i in sum){print "hour", i, sum[i]}}' ffile |sort -n
hour 00 150
hour 01 130
hour 23 83325

Or am I missing something?
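
One detail that may be worth spelling out: awk's for (i in sum) loop visits the keys in no guaranteed order, which is why the output is piped through sort. Because every output line starts with the word "hour", a plain sort -n finds no leading number to compare and, as far as I know, falls back to comparing whole lines. A sketch that sorts on the hour column explicitly (the three input lines are just the results from above, shuffled by hand):

```shell
# Hypothetical unordered "hour HH sum" lines, such as the END loop may emit
printf '%s\n' 'hour 23 83325' 'hour 00 150' 'hour 01 130' |
  sort -k2,2n     # sort numerically on the second column, the hour
```

This prints the hours in the order 00, 01, 23 regardless of how awk happened to walk the array.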