Stats do not match, can't work out why this is happening. Any help appreciated


 
# 1  
Old 03-16-2012

Hi,

I've got a problem and I can't work out why it is happening. Any help appreciated:

Using gawk to create statistics (stats.txt) from a huge CSV file (output.txt). Small sample attached.

FYI: The contents of the csv file are from a datalogger counting rare bats exiting a bat roost. Each transit causes a parallel pair of IR beams (start state zero) to log a one and then return to zero. The direction of travel is determined by the sequence of beams broken: 1-2 or 2-1

Thus, exit (out) flight from roost-
beam one: 1,0 beam two: 1,0

Return flight (in) into roost-
beam two:1,0 beam one: 1,0

The code creates a stack, since bats go in and out all the time; we only need to count the maximum out at any one time (thus we can say there are at least x bats in the roost).

The problem I am having is that the stats do not add up properly at the end. At the end of the bats' day (we start each day with zero bats), subtracting the total "in" flights from the total "out" flights does not equal the number of bats calculated to be still out.

e.g.
Date,2011-06-05,IN>,189,OUT>,442,Max count>,286,difference between start of count (0) and end of count = -253 (Max was 22:30:50)

You can see the max count is 286, yet subtracting the ins (189) from the outs (442) gives 253, not the reported count of 286. I am at a loss to work out what is going on here.

The script I'm having problems with is batmon02-02-2012.awk

Here is the code (it is run in Windows). Also attached is a small output.txt file covering only a few hours of data.

ipether.bat (calls the following scripts in sequence)
Code:
@echo off
cls
echo There cannot be any extra linefeeds at the start 
ECHO or end of the input file (output.txt)
echo.
echo It must be in the expected format, any other will cause crash.
ECHO if in doubt, search for two commas together and delete any lines like this.
echo.
echo working... please wait...
 
@echo off
.\zsh.exe .\IpEther1.zsh
echo ====================
rem type stats.txt
echo done!
PAUSE
rem stats.txt > notepad
exit

IpEther1.zsh
Code:
# because windows CMD doesn't actually expand * into arguments for you,
# we need to run this in sh to do so.  
exec ./zsh.exe IpEther2.zsh * > events.txt 2> stats.txt

IpEther2.zsh
Code:
#  calls gawk using output.txt as the input to awk
./gawk.exe -F "," -f batmon02-02-2012.awk < output.txt

batmon02-02-2012.awk
Code:
#this  makes use of csv format file "output.txt" 
# MUST compensate for positive and negative totals
# logger day starts at 4pm actual time which = midnight bat time (since this script operates
# midnight to midnight) thus 16 hours are added
BEGIN { OT=0 # Time of previous measurement
        MAX=0 # Max num of seconds between valid events
        DAY=""; # Current day
        CA=0 ; CB=0 ; CX=0 ; CD=0 # var to "hold" in or "out" A, C are one direction, B, X are the other dir
        # Running total of bats leaving and entering
        TOTALBATS=0;
        MAXBATS=0;    # The highest TOTALBATS has ever been
        maxtime=0; # time of peak event
        # XX = highest negative value of bats (converted to positive)
        XX=0;  
        XXX=0;
        # Length of the patterns
        L=4
        # Patterns to check against
        # Block 1 unBlock 1 block 0 Unblock 0
        A[0]="1,1"; A[1]="1,0"; A[2]="0,1"; A[3]="0,0";
        # Block 1 Block 0 unblock 1 Unblock 0
        X[0]="1,1"; X[1]="0,1"; X[2]="1,0"; X[3]="0,0";
        # Block 0 Unblock 0 Block 1 Unblock 1
        B[0]="0,1"; B[1]="0,0"; B[2]="1,1"; B[3]="1,0";
        # Block 0 block 1 unBlock 0 unblock 1
        D[0]="0,1"; D[1]="1,1"; D[2]="0,0"; D[3]="1,0";
}
# ========================================================
        function print_daily(day,inbt,outbt,maxbt,maxtime)
        {
        I=total; if(I<0) I=-I;
        MX="no maximum"
        if(maxtime > 0)
        MX=sprintf("Max was %s", strftime("%H:%M:%S",maxtime));  #strftime(format, timestamp)
        CZ=(CA-CB); # GET ERROR
 
        printf("Date,%s,IN>,%d,OUT>,%d,Max count>,%d,difference between start of count (0) and end of count = %d (%s)\n",
        day, inbt, outbt, maxbt, CZ, MX) > "/dev/stderr";
       # printf("==") > "/dev/stderr";
        # Reset daily counts
        TOTALBATS=0; MAXBATS=0;  maxtime=0;
         CA=0; CB=0;  CZ=0;
        }
        # end function
        #====================================================================
        { # Calculate timestamp from date string
        T=mktime($1 " " $2 " " $3 " " $5 " " $6 " " $7);
    #  T+=(60*60*16); # Add sixteen hours
        $1=strftime("%Y", T); # Put these back in the strings
        $2=strftime("%m", T);
        $3=strftime("%d", T);
        $5=strftime("%H", T);
        $6=strftime("%M", T);
        $7=strftime("%S", T);
        #    print ($5 " " $6 " " $7) " MAXBATS "MAXBATS  > "/dev/stderr"; #cp
 
        # When the year, month, and/or day changes, time to print daily counts
        if((DAY != $1 "-" $2 "-" $3) && (DAY != ""))
                print_daily(DAY,CA,CB,MAXBATS,maxtime);  #CP
        DAY=$1 "-" $2 "-" $3;   # not part of the if: runs for every record
 
        if($8 == "pv") # Ignore anything but PV lines.
        {
#===============================================================
  # If too much time has passed since the last event, start over.
   if((T-OT) > MAX) # Blank the array   
     for(N=0; N<(L-1); N++) C[N]="";
              else # Shift elements toward the front
                for(N=0; N<(L-1); N++) C[N]=C[N+1];
            OT=T # Set prev time to this one.
    C[L-1]=$9 "," $10; # Set the latest event in the array
 
    # Search for events in the array.
    FOUNDA=1; FOUNDB=1;
    FOUNDX=1; FOUNDD=1;
    for(N=0; N<L; N++)
    {
    if(A[N] != C[N]) FOUNDA=0;
    if(B[N] != C[N]) FOUNDB=0;
    if(X[N] != C[N]) FOUNDX=0;
    if(D[N] != C[N]) FOUNDD=0;
    }
 
    # Count the events and mark the hour they occurred in
    if(FOUNDA || FOUNDX)
    {
     CA++;
      printf("A@%s-%s-%s %s:%s:%s\n",$1,$2,$3,$5,$6,$7);
     AH[$5]++;
     TOTALBATS--;
    # print($5,$6,$7) > "/dev/stderr";    #cp
    }
    if(FOUNDB || FOUNDD)
     {
      CB++;
       printf("B@%s-%s-%s %s:%s:%s\n",$1,$2,$3,$5,$6,$7);
      BH[$5]++;
      TOTALBATS++;
    }
 
   XXX=TOTALBATS;
  if (TOTALBATS<0) #cp     NEXT LINES ARE USED TO GET THE ABSOLUTE VALUE OF TOTALBATS
 {  XXX=TOTALBATS*TOTALBATS; #cp   MULTIPLY NEGATIVE BY NEGATIVE TO GET POSITIVE
  XXX=sqrt(XXX);    #cp      SQRT IS THE BUILT-IN SQUARE ROOT FUNCTION: BACK TO THE ORIGINAL NUMBER, NOW POSITIVE
 }
 
 # Update our maximum daily counts
   if(MAXBATS < XXX)
    {
      MAXBATS=XXX;
      maxtime=(T+=(60*60*16));  # +16 HOURS TO SHOW MAX ACTUAL TIME, NOT LOGGER TIME
                               # i.e. midnight is actually 4pm (thus plus 16 hours)
                              }
 
 print "TOTALBATS "TOTALBATS " MAXBATS "MAXBATS" maxtime "strftime("%m/%d %H:%M:%S", maxtime)" time " strftime("%m/%d %H:%M:%S", T)> "/dev/stderr"; #cp
 
}
}
END {
    # The final statistics will be printed to stderr, to easily
    # separate them from the event times printed to stdout.
 
    # The last daily count
    print_daily(DAY,CA,CB,MAXBATS,maxtime);
    # Print the event counts
 ###   printf("A %2d\nB %2d\nT %2d\n", CA, CB, CA+CB) > "/dev/stderr";
   # printf("A %2d\nB %2d\nX %2d\nD %2d\nT %2d\n", CA, CB, CX, CD, CA+CB+CX+CD) >             "/dev/stderr";
    # Print a list of hours from 1-23
###    STR="H";
###    for(N=1; N<=23; N++) STR=STR sprintf(" %2d", N);;
###    print STR > "/dev/stderr";
    # Print hourly counts for event A
 ###   STR="A";
 ###   for(N=1; N<=23; N++)
###    STR=STR sprintf(" %2d", AH[sprintf("%02d", N)]);
###    print STR > "/dev/stderr";
    # Hourly counts for event B
###    STR="B";
###    for(N=1; N<=23; N++)
 ###   STR=STR sprintf(" %2d", BH[sprintf("%02d",N)]);
 ###   print STR > "/dev/stderr";
  }

Thanks
cmp

This is the history of how I got started:
https://www.unix.com/shell-programmin...ew-logger.html

Last edited by cmp260; 03-16-2012 at 10:26 AM.. Reason: more info
# 2  
Old 03-16-2012
With your provided input I get:
Code:
[..]
TOTALBATS 195 MAXBATS 195 maxtime 06/05 22:04:25 time 06/05 06:04:48
TOTALBATS 195 MAXBATS 195 maxtime 06/05 22:04:25 time 06/05 06:04:48
TOTALBATS 195 MAXBATS 195 maxtime 06/05 22:04:25 time 06/05 06:04:48
B@2011-06-05 06:04:48
TOTALBATS 196 MAXBATS 196 maxtime 06/05 22:04:48 time 06/05 22:04:48
TOTALBATS 196 MAXBATS 196 maxtime 06/05 22:04:48 time 06/05 06:04:51
Date,2011-06-05,IN>,119,OUT>,315,Max count>,196,difference between start of count (0) and end of count = -196 (Max was 22:04:48)

Some timestamps seem to be converted from 06 to 22 or from 05 to 21? Is that correct?
# 3  
Old 03-16-2012
time conversion

Quote:
Some timestamps seem to be converted from 06 to 22 or from 05 to 21? Is that correct?
Yes. Since the bats' "day" starts at 4pm (16:00), the logger is set 16 hours behind. The final calculations add 16 hours to compensate. Normally the only lines shown are the daily counts; this is showing the workings to help debug. Comment out this line to get only the daily counts:
Code:
print "TOTALBATS "TOTALBATS " MAXBATS "MAXBATS" maxtime "strftime("%m/%d %H:%M:%S", maxtime)" time " strftime("%m/%d %H:%M:%S", T)> "/dev/stderr"; #cp

I can provide an entire run of output.txt covering almost a year if you want.
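
A side note on the debug output in post #2: the "time" column also jumps by 16 hours on exactly those lines where a new maximum is recorded. That looks like a consequence of `maxtime=(T+=(60*60*16))` in the script, because `+=` used inside an expression updates T as well as maxtime. The sketch below only illustrates that awk behaviour; `shift_demo` and the starting value 1000 are made up for the example and are not part of the original scripts.

```shell
# Hypothetical demo: "+=" inside an expression changes T as a side effect.
shift_demo() {
  awk 'BEGIN {
    offset = 60 * 60 * 16          # 16 hours in seconds
    T = 1000                       # made-up logger timestamp
    maxtime = (T += offset)        # as in the script: T itself is shifted too
    printf "mutating: T=%d maxtime=%d\n", T, maxtime
    T = 1000
    maxtime = T + offset           # alternative: T keeps the logger time
    printf "plain: T=%d maxtime=%d\n", T, maxtime
  }'
}

shift_demo
```

With the first form both values end up at 58600; with the second, T stays at 1000 and only maxtime becomes 58600. If the shifted "time" column in the debug line is not wanted, `maxtime=T+(60*60*16)` would be one way to avoid it.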
# 4  
Old 03-16-2012
OK, your "output.txt" ends like this:
Code:
2011,06,05,Rx,06,04,48,pv,1,0
2011,06,05,Rx,06,04,51,pv,0,1
2011,0

Is that correct? I take it the last 6 characters are to be deleted?
With the given file what is the expected output?
# 5  
Old 03-16-2012
sorry it got truncated, attached now

Hi, sorry, it got truncated while copying between the virtual environment and the real one.
The correct stats.txt and output.txt are attached now.
# 6  
Old 03-16-2012
Yes, for the initial file I supplied, the last 6 characters are to be deleted. There can be no blank lines within the file, or above or below the text.

The expected output is:
Date,2011-06-05,IN>,189,OUT>,442,Max count>,286,difference between start of count (0) and end of count = -253 (Max was 22:30:50)

(This is with the debugging commented out)
# 7  
Old 03-16-2012
Hi,

TOTALBATS 286 MAXBATS 286 maxtime 06/05 22:30:50 time 06/05 22:30:50

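At 22:30:50 the counts were 140 in and 426 out, which makes a difference of 286. That is the peak, and it is what "Max count" reports.

The end-of-day summary instead uses the final totals, as of the last event at 22:39:07, which give 442 - 189 = 253. The two numbers only agree if the peak happens to be the last event of the day.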
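
To see why a peak and a final net count can differ, here is a minimal sketch. It does not use the real logger format: it assumes a simplified input with one event per line, either "out" or "in", and the function name `peak_vs_final` is made up for illustration.

```shell
# Hypothetical helper: reads one event per line ("out" or "in") and reports
# the peak net count alongside the end-of-day net count.
peak_vs_final() {
  awk '
    $1 == "out" { outs++; net++ }
    $1 == "in"  { ins++;  net-- }
    net > peak  { peak = net }      # peak is only ever raised, never lowered
    END { printf "in=%d out=%d peak=%d final=%d\n", ins, outs, peak, outs - ins }
  '
}

# Made-up sample: three bats leave, then two come back.
printf '%s\n' out out out in in | peak_vs_final
```

This prints `in=2 out=3 peak=3 final=1`. The peak was reached before the two returns, so it stays at 3 while the final difference drops to 1, the same relationship as 286 versus 253 in the real data.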