Sampling and Binning- Engineering problem


 
# 1  
Old 09-03-2008

Hi everyone!


Can you please help me with some shell scripting?


I have an input file input.txt

It has 3 columns (Time, Event, Value)

Time event Value
03:38:22 A 57
03:38:23 A 56
03:38:24 B 24
03:38:25 C 51
03:38:26 B 7
03:38:26 B 59
03:38:27 A 98
03:38:28 A 24
03:38:29 A 35
03:38:30 A 55



I need code that does the following steps.


1) Sort by column 1 (Time) - Ascending order
2) Sample in 5-second blocks: within each block, count the events of type "A" and average their Value. The output for the 1st 5-second block should be:


TBlocks Count_of_EVENT("A") Average Value

1 3 70.33

Then continue with the 2nd 5-second block (which starts at T = 03:38:28), and so on until the end of the file.


Thanks for your support
# 2  
Old 09-04-2008
Code:
#!/usr/bin/perl

use strict;
use warnings;

my ($t1, $sum, $count, $block);

while (<>) {
  chomp;
  my ($timestamp, $event, $value) = split (/ /);
  my ($h, $m, $s) = split (/:/, $timestamp);
  my $t = $s + 60*$m + 3600*$h;

  if (! defined $t1 || $t >= $t1) {
    if (defined $t1) {
      print ++$block, " ", $count, " ", $sum/$count, "\n";
    }
    $t1 = $t + 5;
    $sum = $count = 0;
  }
  if ($event eq "A") {
    ++$count;
    $sum += $value;
  }
}

if ($count) {
  print ++$block, " ", $count, " ", $sum/$count, "\n";
}

Assumes sorted input. I'm not entirely sure I correctly figured out what to count and average but I imagine you can straighten it out if it's not completely correct.

I assume you really meant five-second blocks (for which the first ends at 03:38:26.999999) and so the output is not precisely as you specified. Maybe change the interval to six if you really want 03:38:22 through 03:38:27.999999 in the first block.
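
To make the five-versus-six point concrete, here is a minimal sketch that computes only the first block of the sample data for both interval sizes. The helper name first_block and the hard-coded @rows list are made up for illustration; the time arithmetic is the same as in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows from the first post (already sorted by time).
my @rows = (
    "03:38:22 A 57", "03:38:23 A 56", "03:38:24 B 24", "03:38:25 C 51",
    "03:38:26 B 7",  "03:38:26 B 59", "03:38:27 A 98", "03:38:28 A 24",
    "03:38:29 A 35", "03:38:30 A 55",
);

# first_block() is a hypothetical helper: it returns the count and sum of
# "A" values in the first block, which starts at the first timestamp and
# covers $interval seconds.
sub first_block {
    my ($interval, @lines) = @_;
    my ($end, $count, $sum) = (undef, 0, 0);
    for my $line (@lines) {
        my ($timestamp, $event, $value) = split ' ', $line;
        my ($h, $m, $s) = split /:/, $timestamp;
        my $t = $s + 60*$m + 3600*$h;
        $end = $t + $interval unless defined $end;
        last if $t >= $end;          # first row at or past the block end
        next unless $event eq 'A';
        ++$count;
        $sum += $value;
    }
    return ($count, $sum);
}

for my $interval (5, 6) {
    my ($count, $sum) = first_block($interval, @rows);
    # Dividing is safe here: both blocks contain "A" events for this data.
    printf "%d-second block: count=%d average=%.2f\n",
        $interval, $count, $sum / $count;
}
```

For the sample data this prints count=2 average=56.50 for the 5-second block and count=3 average=70.33 for the 6-second block, so only the 6-second interval reproduces the "1 3 70.33" line asked for in the first post.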

Last edited by era; 09-04-2008 at 05:24 AM.. Reason: Note five vs six second block size
# 3  
Old 09-04-2008

hi era,


I am using Cygwin. Can you please help me with how
to execute this program in Cygwin?

Also, how do I give it the input file (Input file = "Input.txt")?

This is my first time using Cygwin and my first time running a script.


Thank you so much for your support
# 4  
Old 09-04-2008
I'm not very familiar with Cygwin, but if you store the script in sample.pl you would simply run it with

Code:
A:\> sort -t : -n Input.txt | perl sample.pl >Output.txt

where A:\> is my possibly uninformed guess about what the Cygwin prompt looks like. (Actually I guess it's more like you@wintendo$ really.)
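
If you would rather not depend on the external sort, the ordering can also be done in Perl by converting each timestamp to seconds first. This is only a sketch: seconds_of and sort_by_time are made-up helper names, and it assumes every data line starts with an H:MM:SS timestamp. Lines that do not (the header row, blank lines) are dropped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# seconds_of() is a hypothetical helper: "H:MM:SS ..." -> seconds since
# midnight, or nothing if the line does not start with a timestamp.
sub seconds_of {
    my ($line) = @_;
    my ($h, $m, $s) = $line =~ /^\s*(\d+):(\d+):(\d+)/ or return;
    return $s + 60*$m + 3600*$h;
}

# sort_by_time() is a hypothetical helper: it drops lines without a
# timestamp and returns the rest in ascending time order.
sub sort_by_time {
    my @lines = @_;
    # Pair each line with its numeric key, sort on the key, then unwrap.
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ seconds_of($_), $_ ] }
           grep { defined seconds_of($_) } @lines;
}

print "$_\n" for sort_by_time(
    "Time event Value", "12:00:00 A 1", "3:20:01 A 22", "", "3:13:09 B 32",
);
```

Because the comparison is numeric on total seconds, 3:13:09 sorts before 12:00:00 even though the hours are not zero-padded.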
# 5  
Old 09-04-2008
hi era,


Thanks for your quick response.

I am getting one error:

"Illegal division by zero at cp.pl line 16, <> line 2."

line 16 = print ++$block, " ", $count, " ", $sum/$count, "\n";

input file:

3:13:09 B 32
3:14:01 B 51
3:14:03 A 100
3:20:00 A 77
3:20:01 A 22
3:20:02 A 44
3:20:03 A 35
3:20:03 B 17
3:20:04 B 2
3:20:05 A 65
3:20:06 B 51
3:20:07 A 100
3:20:08 A 77
3:20:09 A 22
3:20:10 A 44
3:20:10 A 35
3:20:11 B 17
3:20:12 B 2
# 6  
Old 09-04-2008
hi era,


I found the problem.

The division by zero error happens
if the "Event" in the first row (after sorting) is NOT "A".

You also get an error if there are any blank lines at the end of the file.


Any suggestions on how to solve this?


thanks
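
One possible way around both problems, sketched under a few assumptions: the division fails whenever a block that gets printed contains no "A" events at all (with the input above, the first block holds only the B row at 3:13:09), so the sketch prints "n/a" instead of dividing, and it skips any line that does not look like "time event value". The helper names format_block and bin_lines and the "n/a" placeholder are made up for illustration. Unlike the original script, it also prints the last block even when its count is zero.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# format_block() is a hypothetical helper: it reports "n/a" instead of
# dividing when a block contains no "A" events.
sub format_block {
    my ($block, $count, $sum) = @_;
    my $avg = $count ? sprintf("%.2f", $sum / $count) : "n/a";
    return "$block $count $avg";
}

# bin_lines() takes sorted input lines and returns one report line per block.
sub bin_lines {
    my @lines = @_;
    my ($t1, $sum, $count, $block, @out);
    for (@lines) {
        # Skip blank or malformed lines rather than failing on them.
        next unless /^\s*(\d+):(\d+):(\d+)\s+(\S+)\s+(\d+(?:\.\d+)?)/;
        my ($t, $event, $value) = ($3 + 60*$2 + 3600*$1, $4, $5);
        if (!defined $t1 || $t >= $t1) {
            push @out, format_block(++$block, $count, $sum) if defined $t1;
            $t1 = $t + 5;
            $sum = $count = 0;
        }
        if ($event eq "A") {
            ++$count;
            $sum += $value;
        }
    }
    # Flush the last block if at least one valid line was seen.
    push @out, format_block(++$block, $count, $sum) if defined $t1;
    return @out;
}

print "$_\n" for bin_lines(
    "3:13:09 B 32", "3:14:01 B 51", "3:14:03 A 100", "", "3:20:00 A 77",
);
```

With these five lines the output is "1 0 n/a", "2 1 100.00" and "3 1 77.00". To read a real file instead, the lines could be collected with something like my @lines = <>; before calling bin_lines.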
# 7  
Old 09-05-2008
hi era,

I am trying to modify the code so I can get the count and sum for every event type in one row, still using the five-second blocks.


Wanted output:


Timestamps--CountA--CountB--SumA---SumB


Below is the modified code (but it is not working).

Can you please help?


thanks



#!/usr/bin/perl

use strict;
use warnings;

my ($t1, $sumA, $countA, $sumB, $countB, $block);

while (<>) {
  chomp;
  my ($timestamp, $event, $value) = split (/ /);
  my ($h, $m, $s) = split (/:/, $timestamp);
  my $t = $s + 60*$m + 3600*$h;

  if (! defined $t1 || $t >= $t1) {
    if (defined $t1) {
      print ++$block, " ", $countA, " ", $sumA, " ", $countB, " ", $sumB, "\n";
    }
    $t1 = $t + 5;
    $sumA = $countA = 0;
    $sumB = $countB = 0;
  }
  if ($event eq "A") {
    ++$countA;
    $sumA += $value;

    if ($event eq "B") {
      ++$countB;
      $sumB += $value;
    }
  }
}
if ($countA,$countB) {
  print ++$block, " ", $countA, " ", $sumA, " ", $countB, " ", $sumB, "\n";
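
Three things appear to stop the script above from working: the test for "B" sits inside the test for "A", so it can never be true; if ($countA,$countB) uses the comma operator, which only tests $countB; and the last if is never closed. A possible rework is sketched below. It keeps the counts and sums in hashes so more event types can be added to @events, and it prints the columns in the order of the wanted header (block number, CountA, CountB, SumA, SumB), with the block number standing in for the timestamp as in the script above. The helper name bin_by_event is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @events = ("A", "B");   # event types to report, in column order

# bin_by_event() is a hypothetical helper: it takes sorted input lines and
# returns one row per block: block number, counts per event, sums per event.
sub bin_by_event {
    my @lines = @_;
    my ($t1, $block, %count, %sum, @out);
    my $flush = sub {
        push @out, join " ", ++$block, @count{@events}, @sum{@events};
    };
    for (@lines) {
        # Skip blank or malformed lines.
        next unless /^\s*(\d+):(\d+):(\d+)\s+(\S+)\s+(\d+(?:\.\d+)?)/;
        my ($t, $event, $value) = ($3 + 60*$2 + 3600*$1, $4, $5);
        if (!defined $t1 || $t >= $t1) {
            $flush->() if defined $t1;
            $t1 = $t + 5;
            @count{@events} = (0) x @events;   # reset per-event counters
            @sum{@events}   = (0) x @events;
        }
        next unless exists $count{$event};     # ignore other event types
        ++$count{$event};
        $sum{$event} += $value;
    }
    $flush->() if defined $t1;                 # last block
    return @out;
}

print "$_\n" for bin_by_event(
    "3:20:00 A 77", "3:20:01 A 22", "3:20:02 A 44", "3:20:03 A 35",
    "3:20:03 B 17", "3:20:04 B 2",  "3:20:05 A 65", "3:20:06 B 51",
);
```

For these eight rows from the earlier input file, the sketch prints "1 4 2 178 19" and then "2 1 1 65 51".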
