Hello, I have a question about python's numpy.
The correlation coefficient can be calculated with numpy.corrcoef, but when I compute numpy.cov/(numpy.std(X)*numpy.std(Y)) to verify the formula, the result does not match.
I would appreciate it if someone could tell me the cause of this.
The values for each are as follows:
· numpy.corrcoef (two-dimensional array of X and Y):
array([[ 1.        , -0.55847735],
       [-0.55847735,  1.        ]])
· numpy.cov (two-dimensional array of X and Y):
array([[ 7.01969195e-01, -2.42092650e+01],
       [-2.42092650e+01,  2.67691160e+03]])
· numpy.std(X):
0.8375159287888337
· numpy.std(Y):
51.719112506809196
· numpy.cov(two-dimensional array of X and Y)/(numpy.std(X)*numpy.std(Y)):
array([[ 1.61935480e-02, -5.58477348e-01],
       [-5.58477348e-01,  6.17529897e+01]])
Thank you for your cooperation.
python3 numpy
np.cov() normalizes by N-1 by default (equivalent to ddof=1), while np.std() defaults to ddof=0 and normalizes by N.
Therefore, specifying ddof=1 for np.std() yields the same off-diagonal values as np.corrcoef().
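As a quick check of those defaults, here is a minimal sketch. The data, the seed, and the variable names are arbitrary choices for illustration and are not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, illustration only
x = rng.random(1000)
n = x.size

var_cov = float(np.cov(x))          # np.cov normalizes by n - 1 by default
var_std0 = np.std(x) ** 2           # np.std default (ddof=0) normalizes by n
var_std1 = np.std(x, ddof=1) ** 2   # ddof=1 normalizes by n - 1

# The variance from np.cov agrees with np.std only when ddof=1 is given.
print(np.isclose(var_cov, var_std1))
# With the default ddof=0 the two differ by the factor n / (n - 1).
print(np.isclose(var_cov / var_std0, n / (n - 1)))
```

The factor n / (n - 1) is close to 1 for large samples, which is why the mismatch only shows up in the later decimal places.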
In [1]: import numpy as np
In [2]: np.random.seed(0)
   ...: X, Y = np.random.rand(2, 1000)
In [3]: np.corrcoef(X, Y)
Out[3]:
array([[1.        , 0.00601658],
       [0.00601658, 1.        ]])
# NG: np.std() uses ddof=0 here
In [4]: np.cov(X, Y)/(np.std(X)*np.std(Y))
Out[4]:
array([[0.97301135, 0.0060226 ],
       [0.0060226 , 1.0297958 ]])
# OK: the off-diagonal elements match np.corrcoef()
In [5]: np.cov(X, Y)/(np.std(X, ddof=1)*np.std(Y, ddof=1))
Out[5]:
array([[0.97203834, 0.00601658],
       [0.00601658, 1.028766  ]])
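Note that the upper-left and lower-right elements are not 1.
That is because they are the covariance of X with itself (the variance of X) and the covariance of Y with itself (the variance of Y), each divided by the product of the standard deviations of X and Y rather than by its own variance.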
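If the whole matrix should match np.corrcoef(), including the diagonal, one possible approach is to divide each element of the covariance matrix by the product of the two standard deviations that belong to its row and column. The sketch below uses np.outer for that; the data and variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, illustration only
x, y = rng.random((2, 1000))

cov = np.cov(x, y)               # 2x2 covariance matrix (normalized by n - 1)

# Standard deviations with the same normalization as np.cov
s = np.array([np.std(x, ddof=1), np.std(y, ddof=1)])

# np.outer(s, s)[i, j] == s[i] * s[j], so the diagonal is divided by the
# variance of each variable and the off-diagonal by std(x) * std(y).
corr = cov / np.outer(s, s)

print(np.allclose(corr, np.corrcoef(x, y)))
```

Here s could equally be taken as np.sqrt(np.diag(cov)), which avoids specifying ddof at all.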