Correlation Coefficients
The general formula for the correlation coefficient between two variables A and B is:
r = cov(A,B) / (SA * SB)
where cov(A,B) is the covariance between A and B, and SA and SB are their standard deviations.
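Written out in full, the pieces of this formula look as follows. This is a sketch that assumes the sample (N-1) definitions, which are what R's cov() and sd() use; the bar notation for the means and the index i are introduced here for illustration.

```latex
\begin{aligned}
\operatorname{cov}(A,B) &= \frac{1}{N-1}\sum_{i=1}^{N}(A_i-\bar{A})(B_i-\bar{B}) \\
S_A &= \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(A_i-\bar{A})^2}, \qquad
S_B = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(B_i-\bar{B})^2} \\
r &= \frac{\operatorname{cov}(A,B)}{S_A\,S_B}
   = \frac{\sum_{i=1}^{N}(A_i-\bar{A})(B_i-\bar{B})}
          {\sqrt{\sum_{i=1}^{N}(A_i-\bar{A})^2}\,\sqrt{\sum_{i=1}^{N}(B_i-\bar{B})^2}}
\end{aligned}
```

The 1/(N-1) factors cancel in the ratio, so r comes out the same as long as the same divisor is used in the covariance and in both standard deviations.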
Manual Way in R
> #Define 2 vectors
>
> A <- c(1,2,4,5)
> B <- c(5,6,8,1)
>
> #Finding Covariance between A & B
> A_diff <- A - mean(A)
> B_diff <- B - mean(B)
>
> #Print both the variables created above
> A_diff
[1] -2 -1 1 2
> B_diff
[1] 0 1 3 -4
>
> #Do the summation and divide by N-1 to get the covariance between the two vectors
> #N = 4 in this case (the length of each vector), so we divide by 3
> cov <- sum(A_diff*B_diff)/(length(A)-1)
>
> #Finding the squared difference w.r.t. the mean for the vectors
> A_sq <- A_diff^2
> B_sq <- B_diff^2
>
> #Using the standard deviation formula
>
> A_sd <- sqrt(sum(A_sq)/(length(A)-1))
>
> B_sd <- sqrt(sum(B_sq)/(length(B)-1))
>
> #Print the standard deviation
> A_sd
[1] 1.825742
> B_sd
[1] 2.94392
>
> #Plugging in values to find the correlation coefficient
>
> corr <- cov/(A_sd*B_sd)
>
> #Printing the correlation obtained - Manual way
> corr
[1] -0.3721042
>
> #Using the built-in cor() function for direct computation
>
> corr_test <- cor(A,B)
> corr_test
[1] -0.3721042
The built-in function and the manual way give the same result.
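The manual steps can also be collected into one reusable function. The following is a minimal sketch, not part of the original walkthrough: the name manual_cor and its arguments x and y are hypothetical, and it assumes the sample (N-1) definitions of covariance and standard deviation, taking N from length(x) rather than hard-coding it.

```r
# Hypothetical helper (not from the original post): Pearson correlation by hand.
manual_cor <- function(x, y) {
  # Both vectors must have the same length, and at least 2 elements
  stopifnot(length(x) == length(y), length(x) > 1)
  n <- length(x)

  # Deviations from the mean
  x_diff <- x - mean(x)
  y_diff <- y - mean(y)

  # Sample covariance and sample standard deviations (divide by n - 1)
  covariance <- sum(x_diff * y_diff) / (n - 1)
  x_sd <- sqrt(sum(x_diff^2) / (n - 1))
  y_sd <- sqrt(sum(y_diff^2) / (n - 1))

  # Correlation coefficient
  covariance / (x_sd * y_sd)
}

A <- c(1, 2, 4, 5)
B <- c(5, 6, 8, 1)

result <- manual_cor(A, B)
print(result)

# Compare against the built-in function, allowing for floating-point tolerance
print(all.equal(result, cor(A, B)))
```

Naming the intermediate value covariance rather than cov avoids masking R's built-in cov() function inside the helper.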