Sampling and Binning- Engineering problem


 
# 8  
09-05-2008
Here's a slightly modified version which copes with empty lines and multiple event counts. Each output line starts with the block number, followed by the event label (for legibility), sum, count, and average for each of the selected events. To select more events, add them to the %k hash, e.g. C => 1.

Code:
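#!/usr/bin/perl

use strict;
use warnings;

# %k selects the events to report; the other variables start out empty.
my (%k, $t1, %sum, %count, $block) = (A => 1, B => 1);

# Print one line per block: block number, then label, sum, count, average
# for each selected event (sorted so the column order is stable).
sub report {
  print join (",", ++$block,
    map { $_, $sum{$_} || 0, $count{$_} || 0,
      $count{$_} ? $sum{$_} / $count{$_} : "" } sort keys %k), "\n";
}

while (<>) {
  chomp;
  my ($timestamp, $event, $value) = split (/ /);
  next unless $timestamp;  # skip empty lines
  my ($h, $m, $s) = split (/:/, $timestamp);
  my $t = $s + 60*$m + 3600*$h;  # seconds since midnight

  # Start a new block on the first line, or once past the current window.
  if (! defined $t1 || $t > $t1) {
    report if defined $t1;
    $t1 = $t + 5;
    %sum = %count = ();
  }
  if ($k{$event}) {
    ++$count{$event};
    $sum{$event} += $value;
  }
}

report if %count;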

The average field is left empty when an event's count is zero, to avoid dividing by zero. Here's some sample output for the input you posted.

Code:
1,A,0,0,,B,32,1,32
2,A,100,1,100,B,51,1,51
3,A,243,5,48.6,B,19,2,9.5
4,A,278,5,55.6,B,68,2,34
5,A,0,0,,B,2,1,2
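
To make the row format concrete, here is a small standalone sketch of how one output line is assembled, with a third event C added to the selection. The helper name format_row and its hash-reference arguments are made up for illustration; they are not part of the script above, which does the same thing inline in report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the script above): build one output row
# from the block number, the selected events, and the per-event totals.
sub format_row {
  my ($block, $selected, $sum, $count) = @_;
  return join(",", $block,
    map { my $n = $count->{$_} || 0;
          # label, sum, count, then average (empty when count is zero)
          ($_, $sum->{$_} || 0, $n, $n ? $sum->{$_} / $n : "") }
      sort keys %$selected);
}

# Events A, B and C selected; only B was seen in this block.
my %k = (A => 1, B => 1, C => 1);
print format_row(1, \%k, { B => 32 }, { B => 1 }), "\n";
```

This should print 1,A,0,0,,B,32,1,32,C,0,0, which is the first sample row with an extra, empty group for C at the end. Sorting the keys is a choice made here so that the groups always come out in the same order.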


Last edited by era; 09-05-2008 at 04:13 AM.. Reason: Sample output