I have the following problem: I would like to use awk to calculate the probability of occurrence of a pair of numbers x and y. In other words, how frequently does each pair appear?
For a single integer x, ranging for example from 1 to 100, the awk one-liner has the form:
where datafile contains the values of x:
My question is how to extend the above awk one-liner to a pair of numbers x and y. In this case the datafile looks as follows:
Thanks in advance.
Last edited by Franklin52; 07-20-2010 at 01:00 PM.
Reason: Please use code tags
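One possible way to extend the counting idea to pairs, offered as a minimal sketch and not as the original one-liner (which is not shown above): use both fields together as the array key, count each key, and divide by the total number of pairs read. The function name pair_prob and the four-decimal output format are assumptions made for illustration, and the input is assumed to hold one whitespace-separated x y pair per line.

```shell
# pair_prob: hypothetical helper, not part of the original post.
# Reads "x y" pairs (one per line) from files or stdin and prints
# "x y probability", where probability = count of the pair / total pairs.
pair_prob() {
  awk 'NF >= 2 { count[$1 " " $2]++; total++ }
       END { for (pair in count) printf "%s %.4f\n", pair, count[pair] / total }' "$@" |
    sort -k1,1n -k2,2n   # awk array order is unspecified, so sort for stable output
}

# Example with made-up data: the pair "1 2" appears in 2 of 4 lines.
printf '1 2\n1 2\n3 4\n1 5\n' | pair_prob
```

With the sample input this prints 1 2 0.5000, 1 5 0.2500 and 3 4 0.2500 on separate lines. Empty input prints nothing, because the END loop has no keys to visit.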
Hi,
I am totally new to C programming on Sun Solaris environment. I am an active member on the UNIX forum and a good shell programmer.
I am trying to achieve some calculations in C programming. I have the pseudo code written down but don't know the syntax. I am reading a couple of books on C... (4 Replies)
Hallo all,
I have a script which creates an output ... see below:
root@a7germ:/tmp/pax > cat 20061117.txt
523.047
521.273
521.034
517.367
516.553
517.793
513.114
513.940
I would like to use awk to calculate (a) the total sum of the numbers and (b) the average of the numbers.
Please... (4 Replies)
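A minimal sketch of the sum and average for a one-column file like the one shown, assuming one number per line. The function name sum_avg and the three-decimal output format are illustrative choices, not taken from the thread.

```shell
# sum_avg: hypothetical helper, not from the original thread.
# Adds up the first field of every non-empty line and prints sum and mean.
sum_avg() {
  awk 'NF { sum += $1; n++ }
       END { if (n > 0) printf "sum=%.3f avg=%.3f\n", sum, sum / n }' "$@"
}

# Example using the eight values listed in the post.
printf '%s\n' 523.047 521.273 521.034 517.367 516.553 517.793 513.114 513.940 | sum_avg
```

For the eight values above this prints sum=4144.121 avg=518.015. The n > 0 check avoids a division by zero on an empty file.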
Dear All
How are you
I have files which look like this :
20080406_12:43:55.779 ISC Sprint- 39 21624032999 218925866728
20080406_12:44:07.811 ISC Sprint- 20 21620241815 218927736810
20080406_12:44:00.485 ISC Sprint- 50 21621910404 218913568053... (0 Replies)
I have a list of coordinate data, sampled below.
54555209 784672723
I want it as:
545552.09 7846727.23
Below is my script:
BEGIN {FS= " "; OFS= ","} {print $1*.01,$2*.01}
This is my outcome:
5.5e7 7.8e8
How do I tell awk that I want to keep all the digits instead of outputting... (1 Reply)
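One likely cause is that print converts non-integer numbers using OFMT, which defaults to %.6g and therefore drops digits on values this large. A minimal sketch that sidesteps this by using printf with an explicit format; the function name scale_coords and the space-separated output are assumptions for illustration.

```shell
# scale_coords: hypothetical helper, not from the original thread.
# Divides both fields by 100 and prints them with exactly two decimals,
# so the result does not depend on the default OFMT of %.6g.
scale_coords() {
  awk '{ printf "%.2f %.2f\n", $1 / 100, $2 / 100 }' "$@"
}

# Example using the sample line from the post.
printf '54555209 784672723\n' | scale_coords
```

This prints 545552.09 7846727.23 for the sample line. Setting OFMT to something like "%.2f" in a BEGIN block would be another option if print has to be kept.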
Hi All,
After creating a UFS file system with newfs -i, where can I see the change? I mean, if the inode density has been increased, where can I see it?
I tried fstyp -v <slice>, but I am not sure where to look for the information.
Will appreciate if I can get... (0 Replies)
Hi All,
I have 10 files named samp1.csv, samp2.csv, ..., samp10.csv.
Each file has the same fields:
Count, field1, field2, field3.
And a source.csv file which has three fields field1, field2, field3.
Now, i want to find the total count by taking the field1,... (8 Replies)
Hi there again,
I need to do a simple division on my data, which has a number of rows. I would like to have a simple output like this one:
col1 col2 col3
val1 val2 val1/val2
valn valm valn/valm
Any suggestion is very much appreciated. Thanks much. (2 Replies)
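A minimal sketch of the requested three-column output, assuming two whitespace-separated numeric columns per row. The function name ratio_column, the four-decimal format, and printing NA for a zero divisor are assumptions for illustration.

```shell
# ratio_column: hypothetical helper, not from the original thread.
# Prints "col1 col2 col1/col2" for each row; a zero (or non-numeric)
# second column yields NA instead of a division-by-zero error.
ratio_column() {
  awk 'NF >= 2 {
         if ($2 + 0 != 0) printf "%s %s %.4f\n", $1, $2, $1 / $2
         else             printf "%s %s NA\n", $1, $2
       }' "$@"
}

# Example with made-up data.
printf '10 4\n3 0\n' | ratio_column
```

With the sample input this prints 10 4 2.5000 and 3 0 NA. The same guard applies whenever a divisor read from data can be zero.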
Dear All
I have a data file with entries numbered from 0 to 40,000, like this:
0 5
1 65
2 159
3 356
...
...
40000 19
I want to find the probability distribution of the numbers. The second-column values are angles from 0 to 360, and the first column is the number of files.
I am expecting... (2 Replies)
I am trying to run the awk below. When I split the input and then run another awk to perform a calculation using that split as the input, there are no issues. When I try to combine them, the output is not correct. Is the split not working, or did I do it wrong? Thank you :).
input
... (8 Replies)
In the awk below, I am trying to calculate the percent for a given id. It is very close; the problem is when the number being used in the calculation is zero. I am not sure how to code this condition into the awk, as it happens frequently. The portion in italics was an attempt, but that led to an error. Thank... (13 Replies)
Discussion started by: cmccabe
STDA(1) User Commands STDA(1)
NAME
stda - Simple Tools for Data Analysis (STDA)
DESCRIPTION
STDA includes some primary tools for data analysis. You can evaluate sums, averages, integrals, derivatives, histograms or probability
distribution functions of 1-d data, and eventually plot the results. The programs are stand-alone tools (supporting the standard UNIX input
and output pipelines) intended for data processing from the command line. It should be noted that all but one of the scripts use awk and
core system utilities. For plotting you have to install Gnuplot (see http://gnuplot.info) since 'muplot' is a wrapper around it. In
summary, the package provides utilities for straightforward analysis of data series where a complex analytical approach is not needed and
where an ultimate numerical precision with floating-point numbers is not critical. Some general examples of application cases include
evaluating usage statistics from server logfiles, determining a response time distribution from a series of queries to a [remote] service,
producing a plot from multiple data files, etc.
This software should be considered as an open project to be extended with new command-line driven utilities helpful for performing common
data analysis tasks. Any contributions and suggestions are welcome.
The following programs are included in the distribution:
* maphimbu - histogram builder for 1-d numerical and text data
* mintegrate - average/sum/integral/derivative of 1-d numerical data
* mmval - find minimum and maximum value in a data set
* muplot - plot a multi-curve figure from multiple data by using Gnuplot
* nnum - produce a series of equally separated integers or floats
* prefield - prepare input file for 'muplot' to plot 2-d fields by arrows
EXAMPLES
- Evaluate the current apache2 logfile and make a unique list of the hostnames (or ip-addresses) sorted by the total number of
their http requests:
maphimbu -rs2 /var/log/apache2/access.log
- On a X terminal plot the probability function and the cumulative distribution function of a sin(x) data sample:
nnum -3.14159 3.14159 0.00001 %.6g | awk '{ print $1, sin($1) }' | maphimbu -d0.01 -x2 -ns1 | mintegrate -d0.01 -x1 -y3-S | muplot lp - 1:3,4
COPYRIGHT
Copyright (C) 2009, 2011-2012 Dimitar Ivanov <dimitar.ivanov@mirendom.net>
License: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
stda 1.1.1 February 2012 STDA(1)