Seg18. The Correlation Matrix

From Computational Statistics (CSE383M and CS395T)
Revision as of 16:37, 18 March 2013 by Jzhang (talk | contribs) (Skilled problem)

Skilled problem

problem 1

Random points i are chosen uniformly on a circle of radius 1, and their <math>(x_i,y_i)</math> coordinates in the plane are recorded. What is the 2x2 covariance matrix of the random variables <math>X</math> and <math>Y</math>? (Hint: Transform probabilities from <math>\theta</math> to <math>x</math>. Second hint: Is there a symmetry argument that some components must be zero, or must be equal?)

The matrix would be <math> \begin{bmatrix} Cov(X,X) & Cov(X,Y) \\ Cov(Y,X) & Cov(Y,Y) \\ \end{bmatrix} </math>

Since the sample space is symmetric and the sampling is uniform, the means of X and Y are both 0. The covariance of X with itself is its variance. So the matrix is actually:

<math> \begin{bmatrix} \langle x^2 \rangle & \langle xy \rangle \\ \langle yx \rangle & \langle y^2 \rangle \\ \end{bmatrix} </math>

Since the point (x,y) lies on the unit circle, <math>x = \cos\theta, y = \sin\theta</math>, with <math>\theta</math> uniform on <math>(0, 2\pi)</math>.

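Averaging over <math>\theta</math>, which is uniform on <math>(0, 2\pi)</math> with density <math>\frac{1}{2\pi}</math>, gives <math>\langle x^2 \rangle = \frac{1}{2\pi}\int_0^{2\pi} \cos^2\theta \, d\theta = \frac12</math>, likewise <math>\langle y^2 \rangle = \frac12</math>, and <math>\langle xy \rangle = \frac{1}{2\pi}\int_0^{2\pi} \cos\theta \sin\theta \, d\theta = 0</math>.

Thus, the matrix would be <math> \begin{bmatrix} \frac12 & 0 \\ 0 & \frac12 \\ \end{bmatrix} </math>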
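The covariance matrix for this problem can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original solution; the function name `circle_covariance`, the sample size, and the seed are illustrative assumptions. Because the average of <math>\cos^2\theta</math> over a uniform angle is 1/2, the estimate should come out close to a diagonal matrix with 1/2 on the diagonal.

```python
import numpy as np

def circle_covariance(n_samples=200_000, seed=0):
    """Estimate the covariance matrix of (X, Y) for points uniform on the unit circle."""
    rng = np.random.default_rng(seed)
    # Uniform on the circle means the angle theta is uniform on [0, 2*pi).
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n_samples)
    x = np.cos(theta)
    y = np.sin(theta)
    # np.cov treats each argument as one variable and returns a 2x2 matrix.
    return np.cov(x, y)

cov_circle = circle_covariance()
print(cov_circle)
```

The off-diagonal entries fluctuate around zero, and the two diagonal entries agree with each other up to sampling noise, which is what the symmetry hint suggests.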

problem 2

Points are generated in 3 dimensions by this prescription: Choose λ uniformly random in (0,1). Then a point's (x,y,z) coordinates are (αλ,βλ,γλ). What is the covariance matrix of the random variables (X,Y,Z) in terms of α,β, and γ? What is the linear correlation matrix of the same random variables?

Let <math> X_1 = X, X_2 = Y, X_3 = Z</math>, and let C denote their covariance matrix, with entries <math>C_{ij} = Cov(X_i, X_j)</math>.

The means of the three variables are <math>(\frac{\alpha}2, \frac{\beta}2, \frac{\gamma}2)</math>.

The diagonal entries are the variances of the individual variables. For example,

<math> Var(X) = \int_0^1 (\alpha \lambda - \frac{\alpha}2 )^2 \cdot 1 d \lambda = \frac{\alpha^2}{12}</math>

For the off-diagonal entries, for example,

<math> Cov(X,Y) = \langle (X - \bar{X}) (Y - \bar{Y}) \rangle = \alpha\beta \int_0^1 (\lambda - \frac12)^2 d\lambda = \frac{\alpha\beta}{12}</math>

So the covariance matrix is

<math> \begin{bmatrix} \frac{\alpha^2}{12} & \frac{\alpha\beta}{12} & \frac{\alpha\gamma}{12} \\ \frac{\alpha\beta}{12} & \frac{\beta^2}{12} & \frac{\beta\gamma}{12} \\ \frac{\alpha\gamma}{12} & \frac{\beta\gamma}{12} & \frac{\gamma^2}{12} \\ \end{bmatrix} </math>

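Since <math> r_{ij} = \frac{C_{ij}}{\sqrt{C_{ii} \cdot C_{jj}}} </math>, each entry reduces to the sign of the product of the two coefficients, for example <math> r_{XY} = \frac{\alpha\beta}{|\alpha||\beta|} = \pm 1 </math> (assuming α, β, and γ are all nonzero).

When α, β, and γ all have the same sign, the linear correlation matrix will be <math> \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{bmatrix} </math>. If two of the coefficients have opposite signs, the corresponding off-diagonal entries are -1 instead.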
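As a sanity check on the covariance and correlation matrices for this problem, here is a short sketch that builds the exact matrices and compares the covariance with a Monte Carlo estimate. The helper names and the particular values of α, β, γ are illustrative assumptions, not part of the problem. One coefficient is deliberately chosen negative to show that the correlation entries are ±1 according to the sign of the product of the two coefficients.

```python
import numpy as np

def line_covariance(alpha, beta, gamma):
    """Exact covariance of (alpha*lam, beta*lam, gamma*lam) with lam ~ Uniform(0, 1)."""
    a = np.array([alpha, beta, gamma], dtype=float)
    # Var(lam) = 1/12, so C_ij = a_i * a_j / 12.
    return np.outer(a, a) / 12.0

def correlation_from_covariance(C):
    """Convert a covariance matrix to r_ij = C_ij / sqrt(C_ii * C_jj)."""
    d = np.sqrt(np.diag(C))
    return C / np.outer(d, d)

# Assumed illustrative coefficients (all nonzero, one negative).
alpha, beta, gamma = 2.0, -1.0, 0.5
C_exact = line_covariance(alpha, beta, gamma)
R_exact = correlation_from_covariance(C_exact)

# Monte Carlo estimate of the same covariance matrix.
rng = np.random.default_rng(1)
lam = rng.uniform(0.0, 1.0, size=100_000)
pts = np.outer(lam, np.array([alpha, beta, gamma]))  # shape (n, 3)
C_mc = np.cov(pts, rowvar=False)

print(C_exact)
print(R_exact)
print(C_mc)
```

With all three coefficients of the same sign, `R_exact` is the matrix of all ones. The covariance matrix has rank 1 in either case, since every point lies on a single line through the origin.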

Thought problem

problem 1

Suppose you want to get a feel for what a linear correlation r = 0.3 (say) looks like. How would you generate a bunch of points in the plane with this value of r? Try it. Then try for different values of r. As r increases from zero, what is the smallest value where you would subjectively say "if I know one of the variables, I pretty much know the value of the other"?
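One possible way to try this, sketched below under the assumption that the points are bivariate Gaussian (the problem does not specify a distribution): draw two independent standard normal variables x and z, then set y = r x + sqrt(1 - r^2) z. Both x and y then have unit variance and their covariance is r. The function name `correlated_points` and the sample size are illustrative choices.

```python
import numpy as np

def correlated_points(r, n=20_000, seed=0):
    """Draw n points (x, y) with unit variances and linear correlation r."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    z = rng.standard_normal(n)  # independent of x
    # Var(y) = r^2 + (1 - r^2) = 1 and Cov(x, y) = r.
    y = r * x + np.sqrt(1.0 - r**2) * z
    return x, y

x, y = correlated_points(0.3)
r_hat = np.corrcoef(x, y)[0, 1]
print("target r = 0.3, sample r =", r_hat)

# Repeat for several target values to compare them side by side.
for r in (0.0, 0.3, 0.6, 0.9, 0.99):
    xs, ys = correlated_points(r)
    print(r, np.corrcoef(xs, ys)[0, 1])
```

Making a scatter plot of each (x, y) sample is the natural next step for judging subjectively how tightly one variable determines the other.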

problem 2

Suppose that points in the (x,y) plane fall roughly on a 45-degree line between the points (0,0) and (10,10), but in a band of about width w (in these same units). What, roughly, is the linear correlation coefficient r?
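A rough numerical experiment for this question is sketched below. It assumes one specific model that the problem leaves open: points are uniform along the segment from (0,0) to (10,10), with an independent uniform offset in [-w/2, w/2] perpendicular to the line. Under that assumption the variance along the line is 200/12 and the variance across it is w^2/12, which gives r = (200 - w^2)/(200 + w^2). The function names are illustrative.

```python
import numpy as np

def band_correlation(w, n=100_000, seed=0):
    """Sample r for points in a band of width w around the 45-degree segment."""
    rng = np.random.default_rng(seed)
    length = 10.0 * np.sqrt(2.0)                   # length of the segment (0,0)-(10,10)
    u = rng.uniform(0.0, length, size=n)           # position along the line
    v = rng.uniform(-w / 2.0, w / 2.0, size=n)     # offset across the line
    # Rotate (u, v) by 45 degrees back into (x, y) coordinates.
    x = (u - v) / np.sqrt(2.0)
    y = (u + v) / np.sqrt(2.0)
    return np.corrcoef(x, y)[0, 1]

def band_correlation_formula(w):
    """r under the assumed uniform-band model: (200 - w^2) / (200 + w^2)."""
    return (200.0 - w**2) / (200.0 + w**2)

for w in (0.5, 1.0, 2.0, 5.0):
    print(w, band_correlation(w), band_correlation_formula(w))
```

For small w the formula is approximately 1 - w^2/100, so under this model r stays very close to 1 until the band width becomes a sizable fraction of the length of the line.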