Pearson correlation between two files
Post 302647185 by balajesuri on Monday 28th of May 2012 01:41:03 AM
Code:
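#!/usr/bin/perl
use strict;
use warnings;

# Pearson correlation between the third whitespace-separated column
# of file1 and the same column of file2.
my @f1_data = read_column('file1');
my @f2_data = read_column('file2');

die "file1 and file2 must have the same number of values\n"
    unless @f1_data == @f2_data;
die "no data to correlate\n" unless @f1_data;

my ($x_bar, $x_sd) = avg_sd(@f1_data);
my ($y_bar, $y_sd) = avg_sd(@f2_data);
die "correlation is undefined when a column is constant\n"
    if $x_sd == 0 || $y_sd == 0;

# Sum of the products of the deviations from the two means.
my $numerator = 0;
for my $i (0 .. $#f1_data) {
    $numerator += ($f1_data[$i] - $x_bar) * ($f2_data[$i] - $y_bar);
}

# avg_sd returns the population SD, so the divisor is n * sd_x * sd_y.
my $r = $numerator / (@f1_data * $x_sd * $y_sd);
print "$r\n";

# Return the third field of every line in the named file.
sub read_column {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    my @values;
    while (my $line = <$fh>) {
        my @fields = split /\s+/, $line;
        next unless defined $fields[2];    # skip lines with fewer than three fields
        push @values, $fields[2];
    }
    close $fh;
    return @values;
}

# Return the mean and the population standard deviation (divide by n).
sub avg_sd {
    my @data = @_;
    my ($sum, $sum_of_sq) = (0, 0);
    $sum += $_ for @data;
    my $avg = $sum / @data;
    $sum_of_sq += ($_ - $avg) ** 2 for @data;
    my $sd = sqrt($sum_of_sq / @data);
    return ($avg, $sd);
}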

The script correlates the third whitespace-separated column of file1 with the same column of file2. For the two input files given, the correlation coefficient is 0.999125083532687.
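
For reference, here is the formula the script appears to implement, written out as a sketch. The symbols are my own notation, not from the original post: x_i and y_i are the third-column values from file1 and file2, n is the number of values, and the standard deviations are the population form (divide by n), as in avg_sd.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\sigma_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n\,\sigma_x\,\sigma_y}
```

The mean and standard deviation of y are defined the same way. Using the sample standard deviation (divide by n - 1) gives the same r, provided the numerator is also divided by n - 1 instead of n.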

By the way, if you only have a few data points, I'd suggest using a scientific calculator. I used a Casio FX 991 MS back in college :) I still have it. Masterpiece.