Trimmean of Excel in awk


 
# 1  
Old 08-30-2016

Does anyone know how to implement Excel's TRIMMEAN average in Unix, possibly with awk?

Sample file:
Code:
scr1	100
scr1	2000
scr2	320
scr1	50
scr1	10
scr2	2
scr2	220
scr1	4234
scr2	2435
scr2	2
scr3	345
scr2	356

Expected output, which was produced with =TRIMMEAN(scr1 values, 50%), i.e. ignoring 50% of the values, split between the two ends (the desired percentage can vary):

TRIMMEAN first sorts the values, then excludes a given percentage of them, taken evenly from the lowest and the highest values, and averages the rest. In this case I have ignored 50% of the values, counted across both ends.
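
To make the rule concrete, here is a minimal sketch of that calculation for a single list of numbers. The function name trimmean is made up for illustration, and the rounding rule (drop int(n * percent / 200) values from each end) is an assumption based on how Excel documents TRIMMEAN; it also assumes 0 <= percent < 100.

```shell
# trimmean PERCENT: read one number per line on stdin, print the trimmed mean.
# Hypothetical helper for illustration only.
trimmean() {
    sort -n | awk -v p="$1" '
        { v[NR] = $1 }                      # values arrive in ascending order
        END {
            if (NR == 0) exit 1             # no input, nothing to average
            trim = int(NR * p / 200)        # values dropped from EACH end
            for (i = 1 + trim; i <= NR - trim; i++)
                sum += v[i]
            print sum / (NR - 2 * trim)
        }'
}

# scr1 values from the sample file: 5 values at 50% means 1 dropped per end,
# leaving (50 + 100 + 2000) / 3
printf '%s\n' 100 2000 50 10 4234 | trimmean 50
```

For the five scr1 values this averages 50, 100 and 2000, because 10 and 4234 are dropped.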

Code:
Name Trimmean(50%) Average
scr1	1228	1685
scr3 224.5	555.83
scr3	356	345

# 2  
Old 08-30-2016
Please explain what that TRIMMEAN function does, and how.
And, some numbers in your desired output seem to be incorrect: Shouldn't the scr1 average be 1278.8, and the trimmean value of scr3 be 345 (one single value only)?
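
The plain averages are easy to check. A minimal sketch, assuming whitespace-separated name/value lines on standard input (the function name plain_avg is hypothetical, not something from this thread):

```shell
# plain_avg: read "name value" lines on stdin, print "name average" per name.
# Hypothetical helper for illustration only.
plain_avg() {
    awk '
        { n[$1]++; s[$1] += $2 }            # per-name count and sum
        END { for (k in n) print k, s[k] / n[k] }
    ' | sort                                # for-in order is unspecified, so sort
}

# Sample data from post #1
plain_avg <<'EOF'
scr1 100
scr1 2000
scr2 320
scr1 50
scr1 10
scr2 2
scr2 220
scr1 4234
scr2 2435
scr2 2
scr3 345
scr2 356
EOF
```

With the sample data this gives 1278.8 for scr1, 555.833 for scr2 and 345 for scr3.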
# 3  
Old 08-30-2016
Here is something to start with; feel free to correct any errors:
Code:
sort -k2n file | awk -v OFS='\t' '
        BEGIN {
                print "Name", "Trimmean(50%)", "Average"
        }
        {
                ++C[$1]                 # Count
                S[$1] += $2             # Sum
                T[$1 FS C[$1]] = $2     # Values in ascending order (input is sorted by value)
                M[$1] = C[$1]           # Max Count
        }
        END {
                for ( k in C )
                {
                        # Always drops exactly one value from each end, which matches
                        # a 50% TRIMMEAN only for groups of 4 to 7 values
                        if ( M[k] > 2 )
                        {
                                for ( i = 2; i <= M[k]-1; i++ )
                                        V[k] += T[k FS i]
                                tm = V[k] / (M[k]-2)
                        }
                        else
                                tm = S[k] / M[k]        # Too few values to trim: plain average
                        printf "%s\t%.1f\t%.1f\n", k, tm, S[k]/M[k]
                }
        }
'

# 4  
Old 08-30-2016
Hi RudiC

You are right, the numbers in my desired output are wrong.

Here is the corrected version:
Code:
scr1	716.66	1278.8
scr2	224.5	555.83
scr3	345	345

I am not sure whether I can link to an external website here, but this is a very good example of how TRIMMEAN works:

Ignore Outliers with Excel TRIMMEAN - Contextures Blog

---------- Post updated at 02:15 PM ---------- Previous update was at 01:17 PM ----------

Thanks, Yoda.

That works well. I would like to add three more values, printed on the same line, like this:
Code:
Name	Trimmean	avg	count	min	max
scr1	716.66	1278.8	5	10	4234
scr2	224.5	555.83	6	2	2435
scr3	345	345	1	345	345

Also, could you please explain the code? I mostly understand it, but some parts confuse me. Thanks.
# 5  
Old 08-30-2016
Code:
sort -k2n file | awk -v OFS='\t' '
        BEGIN {
                print "Name", "Trimmean(50%)", "Average", "Count", "Min", "Max"
        }
        {
                ++C[$1]                 # Count
                S[$1] += $2             # Sum
                T[$1 FS C[$1]] = $2     # Values in ascending order (input is sorted by value)
                M[$1] = C[$1]           # Max Count
        }
        END {
                # For every key in C array (scr*)
                for ( k in C )
                {
                        # With more than 2 values, drop the first (smallest) and last (largest)
                        # and average the rest, i.e. indexes 2 .. M[k]-1
                        if ( M[k] > 2 )
                        {
                                for ( i = 2; i <= M[k]-1; i++ )
                                        V[k] += T[k FS i]
                                tm = V[k] / (M[k]-2)
                        }
                        else
                                tm = S[k] / M[k]        # Too few values to trim: plain average
                        # Sorted input: T[k FS 1] is the minimum, T[k FS M[k]] the maximum
                        printf "%s\t%.1f\t%.1f\t%d\t%d\t%d\n", k, tm, S[k]/M[k], C[k], T[k FS 1], T[k FS M[k]]
                }
        }
'

# 6  
Old 08-30-2016
Maybe this will come closer to what you want:
Code:
#!/bin/ksh
percent=${1:-50}	# Percentage (0 <= percent < 100)
file=${2:-file}		# Pathname of file to process.
			# Enable debugging printouts if more than 2 operands.

sort -k1,1 -k2,2n "$file" | awk -v p="$percent" '
function calc() {
	trim = int(cnt * p / 200)	# Number of values to drop from each end.
	for(i = 1 + trim; i <= cnt - trim; i++)
		trimsum += data[i]
	if(d) {	# Debugging printout.
		print "\tcalc(): cnt=" cnt, "trim=" trim, \
		    "trimsum=" trimsum
		for(i = 1; i <= cnt; i++)
			printf("\t\tdata[%d]=%s\n", i, data[i])
	}
	print last, trimsum / (cnt - 2 * trim), sum / cnt
}
$1 != last {
	if(cnt) {
		calc()
		cnt = sum = trimsum = 0
	} else	OFS = "\t"
	last = $1
}
{	data[++cnt] = $2
	sum += $2
}
END {	calc()
}' d=$(($# > 2))

With the sample data you provided in post #1 stored in a file named file, and the script called with no operands (so it uses the default 50% trim and the default file name file), it produces the output:
Code:
scr1	716.667	1278.8
scr2	224.5	555.833
scr3	345	345

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
# 7  
Old 08-30-2016
Thanks, Don.

Could you please show me how to add the count, max, and min as well?
Expected output:

Code:
Name	Trimmean	avg	count	min	max
scr1	716.66	1278.8	5	10	4234
scr2	224.5	555.83	6	2	2435
scr3	345	345	1	345	345
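
One possible way to get the extra columns is sketched below. It is built on the sort-then-group approach from post #6 and is not taken from any reply in the thread; the function name trimstats and the header labels are assumptions. Because each group is already sorted by value, the minimum and maximum are simply the first and last stored values.

```shell
# trimstats PERCENT: read "name value" lines on stdin and print, per name,
# the trimmed mean, plain average, count, min and max (tab-separated).
# Hypothetical helper; assumes 0 <= PERCENT < 100.
trimstats() {
    sort -k1,1 -k2,2n | awk -v p="$1" -v OFS='\t' '
        function report(   i, trim, trimsum) {
            trim = int(cnt * p / 200)           # values dropped from each end
            for (i = 1 + trim; i <= cnt - trim; i++)
                trimsum += data[i]
            # data[] is sorted, so data[1] is the min and data[cnt] the max
            print last, trimsum / (cnt - 2 * trim), sum / cnt, cnt, data[1], data[cnt]
        }
        BEGIN { print "Name", "Trimmean", "Avg", "Count", "Min", "Max" }
        $1 != last {                            # a new name starts here
            if (cnt) report()
            cnt = sum = 0
            last = $1
        }
        { data[++cnt] = $2; sum += $2 }
        END { if (cnt) report() }'
}
```

It could be called as trimstats 50 < file. The values are printed with awk's default number formatting, so scr1 shows 716.667 rather than 716.66; switching the print to printf with "%.2f" would control the number of decimals.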
