Compute the average of each segment in a data file, ignoring outliers, using awk
I have data files that look like this (say, data.txt):
As you can see, the data is divided into segments based on column 1. I use the following code to compute the mean of each segment. It prints the column 1 value for that segment, the mean of the column 2 values, and a few other things so I can check that I am doing the right thing.
Unfortunately, as you can see, some of these segments contain outliers. I need to remove them before computing the mean so that they don't skew my results. I am using awk to process my data.
This is what I have been able to do so far: if I write one segment to a file, say temp.txt, I can use the following code to remove the outlier in that segment.
But I need to be able to do this within the code that computes the average, so that the mean excludes the outlier.
Any assistance will be highly appreciated.
Malandisa
Last edited by Scott; 09-18-2014 at 03:40 PM..
Reason: Moved from Programming forum
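One possible way to fold the outlier removal into the averaging step is sketched below. The sample data and the original scripts are not shown in the thread, so this makes several assumptions: the file is whitespace-separated, column 1 is the segment id, column 2 is the value, and all rows of a segment are contiguous. The outlier rule is also an assumption (drop values more than k standard deviations from the segment mean), and the names segment_mean and emit_segment are made up for illustration.

```shell
# Hypothetical helper: per-segment mean of column 2 with outliers dropped.
# Assumed rule: a value is an outlier if it lies more than k population
# standard deviations from the mean of its own segment (k defaults to 2).
# Usage: segment_mean data.txt [k]
# Output per segment: id, trimmed mean, values kept, values dropped.
segment_mean() {
    awk -v k="${2:-2}" '
    # Names after the extra spaces are awk locals, reset on every call.
    function emit_segment(   i, sum, sumsq, mean, var, sd, d, kept, ksum) {
        if (n == 0) return
        for (i = 1; i <= n; i++) { sum += v[i]; sumsq += v[i] * v[i] }
        mean = sum / n
        var  = sumsq / n - mean * mean
        sd   = (var > 0) ? sqrt(var) : 0
        for (i = 1; i <= n; i++) {
            d = v[i] - mean
            if (d < 0) d = -d
            if (sd == 0 || d <= k * sd) { ksum += v[i]; kept++ }
        }
        # Fall back to the plain mean if the cutoff rejected everything.
        if (kept == 0) { ksum = sum; kept = n }
        printf "%s %.4f %d %d\n", seg, ksum / kept, kept, n - kept
        n = 0
    }
    NF >= 2 {
        # A change in column 1 closes the previous segment.
        if (n > 0 && $1 != seg) emit_segment()
        seg = $1
        v[++n] = $2
    }
    END { emit_segment() }
    ' "$1"
}
```

The script buffers one segment at a time, so the mean and standard deviation are known before any value is tested. With very short segments a cutoff of 2 may never trigger, because one extreme value inflates the standard deviation it is measured against; a smaller k or a median-based rule may suit such data better.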