About the correlation coefficient in NumPy

Asked 2 years ago, Updated 2 years ago, 59 views

Hello, I have a question about Python's NumPy.
The correlation coefficient can be calculated with numpy.corrcoef, but when I tried to verify the formula by computing numpy.cov(X, Y)/(numpy.std(X)*numpy.std(Y)), the result did not match numpy.corrcoef. I would appreciate it if someone could tell me the cause of this.

The values for each are as follows:
· numpy.corrcoef (two-dimensional array of X and Y):

array([[ 1.        , -0.55847735],
       [-0.55847735,  1.        ]])

·numpy.cov (two-dimensional array of X and Y):

array([[ 7.01969195e-01, -2.42092650e+01],
       [-2.42092650e+01,  2.67691160e+03]])

·numpy.std(X):

0.8375159287888337

·numpy.std(Y):

51.719112506809196

·numpy.cov(two-dimensional array of X and Y)/(numpy.std(X)*numpy.std(Y)):

array([[ 1.61935480e-02, -5.58477348e-01],
       [-5.58477348e-01,  6.17529897e+01]])

Thank you for your cooperation.

python3 numpy

2022-09-30 14:21

1 Answer

np.cov() normalizes by N-1 by default (equivalent to ddof=1), while np.std() defaults to ddof=0 (normalizes by N).
Therefore, specifying ddof=1 for np.std() makes the off-diagonal elements match the result of np.corrcoef().
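To make the difference in defaults concrete, here is a minimal sketch on a tiny made-up sample (the array `x` and the variable names are only for illustration). It compares the variance NumPy computes with ddof=0 and ddof=1 against the default output of np.cov().

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])  # made-up sample data

var_population = np.var(x)       # ddof=0 (default): divides by N
var_sample = np.var(x, ddof=1)   # ddof=1: divides by N - 1
cov_default = float(np.cov(x))   # np.cov() divides by N - 1 by default

print(var_population)                       # 1.25
print(np.isclose(var_sample, cov_default))  # True
```

The squared deviations from the mean sum to 5, so dividing by N=4 gives 1.25 and dividing by N-1=3 gives about 1.667, which is what np.cov() returns.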

In [1]: import numpy as np

In [2]: np.random.seed(0)
   ...: X, Y = np.random.rand(2, 1000)

In [3]: np.corrcoef(X, Y)
Out[3]:
array([[1.        , 0.00601658],
       [0.00601658, 1.        ]])

# NG: np.std() uses ddof=0 by default
In [4]: np.cov(X, Y) / (np.std(X) * np.std(Y))
Out[4]:
array([[0.97301135, 0.0060226 ],
       [0.0060226 , 1.0297958 ]])

# OK: ddof=1 matches the normalization of np.cov()
In [5]: np.cov(X, Y) / (np.std(X, ddof=1) * np.std(Y, ddof=1))
Out[5]:
array([[0.97203834, 0.00601658],
       [0.00601658, 1.028766  ]])

Note that the upper-left and lower-right elements are not 1. This is because the variance of X (the covariance of X with itself) and the variance of Y (the covariance of Y with itself) are each divided by std(X)*std(Y), rather than by std(X)**2 and std(Y)**2 respectively.
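If the goal is to reproduce the whole matrix returned by np.corrcoef(), including the 1s on the diagonal, one possible approach is to divide each element of the covariance matrix by the product of the two corresponding standard deviations. The sketch below does this with np.outer(); the seed, sample size, and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for illustration only
X, Y = rng.random((2, 1000))     # two made-up samples of 1000 values each

cov = np.cov(X, Y)               # 2x2 covariance matrix, normalized by N - 1
std = np.array([np.std(X, ddof=1), np.std(Y, ddof=1)])

# Element (i, j) is divided by std[i] * std[j], so the diagonal becomes 1.
corr = cov / np.outer(std, std)

print(np.allclose(corr, np.corrcoef(X, Y)))  # True
```

The same `std` vector could also be obtained as np.sqrt(np.diag(cov)), which avoids having to keep the ddof values in sync by hand.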


2022-09-30 14:21



© 2024 OneMinuteCode. All rights reserved.