Help with identifying the gradient and the coefficient of determination of a straight line


 
# 1  
Old 04-03-2014

Hi,

Does anybody have experience using awk or perl to find the gradient of a straight line and the coefficient of determination (R-squared, R2) of a chart?

Below is my input file:
Code:
t1 2 3 5 8
t2 0 2 0 2
t3 1 1 1 1
t4 50 70 80 90
.
.
.

Desired output:
Code:
t1 2 3 5 8 0.6986 0.999
t2 0 2 0 2 0.1279 0.2237
t3 1 1 1 1 1 #N/A
t4 50 70 80 90 3.8813 0.9426
.
.
.

In the input file, column 1 is the item whose gradient and coefficient of determination (R-squared) I want to calculate.
Columns 2, 3, 4 and 5 are the values at 0, 3, 6 and 10 seconds.
Columns 6 and 7 in the desired output are the gradient and the R-squared value for the item in column 1.

As far as I know, Microsoft Excel can calculate the gradient of a straight line and the R-squared value (R2) of a scatter plot
when we display the trendline equation and the R-squared value on the chart.

I have a long list of items for which I want to calculate the gradient and the R-squared value,
so I am curious whether anybody has experience calculating them with awk or perl.

Thanks for any advice.
# 2  
Old 04-03-2014
I'm assuming that you want the gradient of the best fit line. Is that correct?
# 3  
Old 04-03-2014
Hi blakeoft,

Yup, you're right. I just want to find the gradient of the best-fit line. It does not necessarily have to match exactly the values I generated manually in Microsoft Excel.
As long as it can generate the gradient of the best-fit line and the R-squared value, that is good enough.

I have around 10k items whose best-fit gradient and R-squared value I want to calculate.
I cannot do them manually one by one,
so I hope there is an alternative way to calculate them automatically.

Thanks a lot, I really appreciate your advice.
# 4  
Old 04-03-2014
This will get you the slope of the least squares line.
Code:
>awk '{sumy=$2+$3+$4+$5} {sumxy=$3*3+$4*6+$5*10} {print $1,$2,$3,$4,$5,(4.0*sumxy-19*sumy)/(4.0*145-361)}' file.txt
t1 2 3 5 8 0.611872
t2 0 2 0 2 0.127854
t3 1 1 1 1 0
t4 50 70 80 90 3.88128

I'm not sure why our slopes for the first line differ so much, but the slope of the third line is definitely zero, since all of its values are equal. This awk line only works if you have four data points for each line, taken at 0, 3, 6 and 10 seconds. Otherwise, you'll need to make some modifications. Also, I don't know what the coefficient of determination is, so I need to look it up. You might be able to do it yourself if you follow this example.
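
For anyone wondering where the constants 19, 145 and 361 in the awk line come from, here is a short worked version of the standard least-squares slope formula. It assumes the sample times x = 0, 3, 6, 10 given in the first post; the symbols m, n, x_i and y_i are notation chosen here, not taken from the posts.

```latex
\begin{align*}
m &= \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2},
  \qquad n = 4,\quad \sum x_i = 0+3+6+10 = 19,\quad \sum x_i^2 = 0+9+36+100 = 145 \\
% worked example for t1, where y = 2, 3, 5, 8
\sum y_i &= 18, \qquad \sum x_i y_i = 0\cdot 2 + 3\cdot 3 + 6\cdot 5 + 10\cdot 8 = 119 \\
m_{t1} &= \frac{4\cdot 119 - 19\cdot 18}{4\cdot 145 - 19^2} = \frac{134}{219} \approx 0.611872
\end{align*}
```

So `4.0*145-361` is just n times the sum of x squared minus the square of the sum of x, which is the same for every row because the sample times never change.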
# 5  
Old 04-03-2014
Hi blakeoft,

Many thanks. I will try it out now.
I really appreciate your help.

I will work out the R-squared part myself.
# 6  
Old 04-03-2014
Sorry if this looks awful. It seems to work though.
Code:
>awk '{sumy=$2+$3+$4+$5} {sumxy=$3*3+$4*6+$5*10} {sumy2=$2*$2+$3*$3+$4*$4+$5*$5}
{denomy=(4.0*sumy2-sumy*sumy)} {slope=(4.0*sumxy-19*sumy)/(4.0*145-361)}
{if(denomy!=0) print $0,slope,(4.0*sumxy-19*sumy)*(4.0*sumxy-19*sumy)/((4.0*145-361)*denomy) ;
else print $0,slope,"#N/A"}' file.txt

Output:
Code:
t1 2 3 5 8 0.611872 0.976082
t2 0 2 0 2 0.127854 0.223744
t3 1 1 1 1 0 #N/A
t4 50 70 80 90 3.88128 0.942596
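
The second value printed above appears to be the squared Pearson correlation, which for a straight-line fit is the same as R-squared. A worked check for t1, again assuming x = 0, 3, 6, 10 and using notation chosen here rather than taken from the posts:

```latex
\begin{align*}
R^2 &= \frac{\left(n\sum x_i y_i - \sum x_i \sum y_i\right)^2}
            {\left(n\sum x_i^2 - \left(\sum x_i\right)^2\right)\left(n\sum y_i^2 - \left(\sum y_i\right)^2\right)} \\
% t1: y = 2, 3, 5, 8
\sum y_i^2 &= 4 + 9 + 25 + 64 = 102, \qquad n\sum y_i^2 - \left(\sum y_i\right)^2 = 408 - 324 = 84 \\
R^2_{t1} &= \frac{134^2}{219 \cdot 84} = \frac{17956}{18396} \approx 0.976082
\end{align*}
```

When all the y values in a row are equal, as in t3, the second factor in the denominator is zero, so R-squared is undefined and the script prints #N/A.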

You might want to consider looking into R (I prefer to use RStudio as an interface). It's pretty easy to use and does stuff like this. Just something to think about if you have more work along the same lines.
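
If the number of data points or the sample times ever change, the same sums can be computed in a loop instead of being hard-coded. This is only a minimal sketch: the function name `fit_lines` and the awk variable `xs` are made up for illustration, and printing #N/A for rows with the wrong number of values is an assumption about what would be wanted.

```shell
# Hypothetical helper (name is illustrative): fit_lines FILE X1,X2,...
# Appends the least-squares slope and R-squared to every row of FILE.
fit_lines() {
    awk -v xs="$2" '
    BEGIN {
        n = split(xs, x, ",")               # x[1..n] holds the sample times
        for (i = 1; i <= n; i++) { sumx += x[i]; sumx2 += x[i] * x[i] }
        denomx = n * sumx2 - sumx * sumx    # same role as 4.0*145-361
        if (denomx == 0) exit 1             # fewer than two distinct x values
    }
    # Assumption: rows without exactly n values get #N/A for both results
    NF != n + 1 { print $0, "#N/A", "#N/A"; next }
    {
        sumy = sumxy = sumy2 = 0
        for (i = 1; i <= n; i++) {
            y = $(i + 1)                    # column 1 is the item name
            sumy += y; sumxy += x[i] * y; sumy2 += y * y
        }
        num    = n * sumxy - sumx * sumy
        denomy = n * sumy2 - sumy * sumy
        slope  = num / denomx
        if (denomy != 0) print $0, slope, num * num / (denomx * denomy)
        else             print $0, slope, "#N/A"   # flat row: R-squared undefined
    }' "$1"
}
```

Called as `fit_lines file.txt 0,3,6,10`, it should print the same slope and R-squared columns as the output above.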

Last edited by blakeoft; 04-03-2014 at 12:41 PM.. Reason: added the output that I got
# 7  
Old 04-03-2014
Hi blakeoft,

Many thanks for your assistance.
The awk code looks awesome to me.

I really appreciate your help.
I will "digest" the meaning of your awk code.
Thanks again.

It is really useful for me.