Calculate percentage of columns greater than certain value in a matrix using awk


 
# 1  
Old 04-09-2014

This matrix represents correlation values.
Is it possible to calculate the percentage of columns (a1, a2, a3) whose absolute value is >= 0.5, and to report separately the percentage with a positive correlation (>= 0.5) and with a negative correlation (<= -0.5)? Thanks in advance!

input

Code:
name	a1	a2	a3
g1	0.8	0.4	0.2
g2	-0.2	-0.6	-0.7
g3	0.1	0.6	0.8
g4	0.1	0	0
g5	-0.2	-0.2	-0.2
g6	-0.1	-0.9	-0.9
g7	0	0	0.2


Last edited by quincyjones; 04-09-2014 at 07:15 AM..
# 2  
Old 04-09-2014
What have you tried?
# 3  
Old 04-09-2014
I am doing this manually, one column at a time, as shown below, but I have around 5000 columns.

Code:
awk '{ sum+=$2} END {print sum}' rm
0.5

Code:
awk '{ sum+=$3} END {print sum}' rm
-0.7

Code:
awk '{ sum+=$4} END {print sum}' rm
-0.6

Code:
(1/3)*100=33.33% (positive corr)
(2/3)*100=66.67% (negative corr)

# 4  
Old 04-09-2014
Modify this code to get your desired output:
Code:
$ awk ' NR > 1 { for(i=2;i<=NF;i++) arr[i]+=$i }
> END { for(i=2;i<=NF;i++) { abs = arr[i] < 0 ? -arr[i] : arr[i]; if( abs >= 0.5) print "Col"i ":" abs } } ' file
Col2:0.5
Col3:0.7
Col4:0.6

# 5  
Old 04-09-2014
Thanks a lot. Is it possible to keep the column names (a1, a2, a3) along with the positive or negative values?
For example:
Code:
a1:0.5
a2:-0.7
a3:-0.6

# 6  
Old 04-09-2014
Code:
$ awk ' NR == 1 { for(i=1;i<=NF;i++) a[i]=$i }
> NR > 1 { for(i=2;i<=NF;i++) arr[i]+=$i }
> END { for(i=2;i<=NF;i++) { if( (arr[i] > 0 ? arr[i] : -1 * arr[i]) >= 0.5) print a[i] ":" arr[i] } } ' file
a1:0.5
a2:-0.7
a3:-0.6
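
To get from the per-column sums to the percentages asked for in the first post, here is a minimal sketch rather than a definitive answer. It assumes that a column counts as positive when its sum is >= 0.5 and as negative when its sum is <= -0.5, that the first row is a header, and that the first field of every row is the row name. The function name `corr_pct`, the threshold variable `t` and the tolerance `eps` are made up for this example; `eps` is only there so that a sum landing on the threshold is not lost to floating-point rounding.

```shell
# Hypothetical helper: corr_pct FILE [THRESHOLD]
# Prints the share of data columns whose sum is >= THRESHOLD (positive)
# and the share whose sum is <= -THRESHOLD (negative). Default THRESHOLD is 0.5.
corr_pct() {
    awk -v t="${2:-0.5}" -v eps=1e-9 '
        # Header row: remember how many fields there are, then skip it.
        NR == 1 { ncol = NF; next }

        # Data rows: accumulate one sum per column, skipping the name in field 1.
        { for (i = 2; i <= ncol; i++) sum[i] += $i }

        END {
            n = ncol - 1                 # number of data columns
            if (n < 1) exit 1            # nothing to report
            for (i = 2; i <= ncol; i++) {
                if (sum[i] >= t - eps)        pos++
                else if (sum[i] <= -t + eps)  neg++
            }
            printf "positive: %d/%d = %.2f%%\n", pos, n, 100 * pos / n
            printf "negative: %d/%d = %.2f%%\n", neg, n, 100 * neg / n
        }
    ' "$1"
}
```

Run as `corr_pct file` on the sample input, this should count one of the three columns as positive and two as negative. A different cutoff can be passed as a second argument, for example `corr_pct file 0.7`.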
