/Segment21


To Calculate

1. Consider a 2-dimensional multivariate normal distribution of the random variable <math>(b_1,b_2)</math> with 2-vector mean <math>(\mu_1,\mu_2)</math> and 2x2 covariance matrix <math>\Sigma</math>. What is the distribution of <math>b_1</math> given that <math>b_2</math> has the particular value <math>b_c</math>? In particular, what is the mean and standard deviation of the conditional distribution of <math>b_1</math>? (Hint: either see Wikipedia "Multivariate normal distribution" for the general case, or else just work out this special case.)


This amounts to slicing the joint distribution along the line <math>b_2=b_c</math>.

<math> \mu'=\mu_1 + \Sigma_{12} \Sigma_{22}^{-1}{(b_c-\mu_2)} </math>


<math> \Sigma' = \Sigma_{11} -\Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} </math>

so the conditional standard deviation asked for is <math>\sqrt{\Sigma'}</math>.
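A minimal numerical check of these formulas, using made-up values for <math>\mu</math>, <math>\Sigma</math>, and <math>b_c</math> (purely illustrative, not data from the course):

import numpy as np

# Made-up example values for the 2-D Gaussian (purely illustrative)
mu = np.array([1.0, 2.0])             # (mu_1, mu_2)
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.5]])        # 2x2 covariance matrix
b_c = 2.5                             # conditioning value of b_2

# Conditional distribution of b_1 given b_2 = b_c
mu_cond = mu[0] + Sigma[0, 1] / Sigma[1, 1] * (b_c - mu[1])
sigma_cond = np.sqrt(Sigma[0, 0] - Sigma[0, 1] * Sigma[1, 0] / Sigma[1, 1])
print(mu_cond, sigma_cond)            # conditional mean and standard deviation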


2. Same, but marginalize over <math>b_2</math> instead of conditioning on it.

<math> \Sigma'=\Sigma_{11}, \qquad \mu'=\mu_1 </math>

To Think About

1. Why should it be called the Fisher Information Matrix? What does it have to do with "information"?

<math> I(b)= -E\left[ \frac {\partial^2 \log P(y_i\mid b)}{\partial b\,\partial b^T} \right] </math>

The information is the curvature of the log-likelihood in <math>b</math> near its maximum. It can be calculated as:

<math> = -\int \frac {\partial^2 \log P(y_i\mid b)}{\partial b\,\partial b^T}\, P(y_i\mid b)\, dy_i </math>

It is the amount of information that the observable variable <math>y</math> carries about the unknown parameter <math>b</math>; the probability distribution of <math>y</math> is conditioned on this unknown <math>b</math>.
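As a concrete illustration (my own example, not from the notes): for a Poisson model <math>P(y\mid b)=b^y e^{-b}/y!</math> we have <math>\frac{\partial^2 \log P}{\partial b^2}=-y/b^2</math>, so <math>I(b)=1/b</math>. A quick Monte Carlo check:

import numpy as np

# Illustrative check of I(b) = -E[d^2 log P(y|b) / db^2] for a Poisson rate b,
# where log P(y|b) = y*log(b) - b - log(y!) and d^2/db^2 log P = -y/b^2.
rng = np.random.default_rng(0)
b_true = 4.0
y = rng.poisson(b_true, size=200000)

second_deriv = -y / b_true**2        # curvature of the log-likelihood at b_true
I_hat = -np.mean(second_deriv)       # Monte Carlo estimate of the expectation
print(I_hat, 1.0 / b_true)           # both should be close to 0.25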

2. Go read (e.g., in Wikipedia or elsewhere) about the "Cramer-Rao bound" and be prepared to explain what it is, and what it has to do with the Fisher Information Matrix.

Cramér-Rao bound:

<math> Var(\hat{b}) \geq \frac 1{I(b)} </math>

So the inverse of the Fisher information matrix is a lower bound on the variance of any unbiased estimator of the parameter <math>b</math>.

Proofs can be found here: [[1]]
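A small simulation illustrating the bound (my own example, assuming a Gaussian with known <math>\sigma</math>): for <math>n</math> samples the Fisher information is <math>n/\sigma^2</math>, and the sample mean attains the resulting bound <math>\sigma^2/n</math>.

import numpy as np

# Illustration of the Cramer-Rao bound for the mean b of a Gaussian with known
# sigma: for n samples I(b) = n/sigma^2, so Var(b_hat) >= sigma^2/n, and the
# sample mean (an unbiased estimator) attains this bound.
rng = np.random.default_rng(1)
b_true, sigma, n, trials = 0.5, 2.0, 50, 20000

samples = rng.normal(b_true, sigma, size=(trials, n))
b_hat = samples.mean(axis=1)          # one estimate per simulated data set

print(b_hat.var())                    # empirical variance of the estimator
print(sigma**2 / n)                   # Cramer-Rao lower bound 1/(n*I(b))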


In Class

I teamed up with Jin and Noah.

We want to minimize this chi-square:

<math> \sum_i {\left[\frac{T_{i,\mathrm{observed}} - T_0 - (T_1-T_0)\exp\left(-\frac{(x_i-x_0)^2+(y_i-y_0)^2}{2\lambda^2}\right)}{\sigma_T}\right]}^2 </math>

There are 5 parameters: <math>T_0</math>, <math>T_1</math>, <math>x_0</math>, <math>y_0</math>, and <math>\lambda</math>.

The estimated parameters were, respectively:

[  23.64998787  192.54561471   57.22606072   80.8109175    37.84544593]
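A sketch of one way such a fit could be set up with scipy.optimize.curve_fit, shown here on synthetic placeholder data rather than the class data set (passing sigma makes curve_fit minimize exactly this chi-square, and absolute_sigma=True keeps the returned parameter covariance in those units):

import numpy as np
from scipy.optimize import curve_fit

# Gaussian-bump temperature model with the 5 parameters T0, T1, x0, y0, lambda
def model(xy, T0, T1, x0, y0, lam):
    x, y = xy
    return T0 + (T1 - T0) * np.exp(-((x - x0)**2 + (y - y0)**2) / (2.0 * lam**2))

# Synthetic placeholder data (not the class data set)
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 500)
y = rng.uniform(0, 100, 500)
sigma_T = 5.0
T_obs = model((x, y), 25.0, 190.0, 57.0, 81.0, 38.0) + rng.normal(0, sigma_T, x.size)

guess = [20.0, 150.0, 50.0, 50.0, 30.0]     # starting values for T0, T1, x0, y0, lambda
popt, pcov = curve_fit(model, (x, y), T_obs, p0=guess,
                       sigma=np.full(x.size, sigma_T), absolute_sigma=True)
print(popt)   # best-fit parameters
print(pcov)   # 5x5 covariance matrix of the parameters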

We got the covariance matrix for the 5 parameters:

<math> \begin{bmatrix}
236.79825955 & 32.64277472 & 8.75985929 & 38.2924856 & -64.37263527 \\
32.64277472 & 1233.72273883 & -46.86796286 & -260.16817485 & -171.64918788 \\
8.75985929 & -46.86796286 & 10.96923276 & 12.92003172 & 3.95387078 \\
38.2924856 & -260.16817485 & 12.92003172 & 83.43342126 & 25.42445204 \\
-64.37263527 & -171.64918788 & 3.95387078 & 25.42445204 & 42.44731375
\end{bmatrix} </math>

Then, to plot the error ellipse from this covariance matrix, we only need the 2x2 submatrix corresponding to <math>x_0</math> and <math>y_0</math> (marginalizing a Gaussian over the other parameters, as in problem 2 above, simply picks out that submatrix).

We used the following Python code:

 

import math
from numpy.linalg import svd
from matplotlib.patches import Ellipse
from matplotlib.pyplot import axis, gca, show

def plotEllipse(pos, P, edge, face):
    # SVD of the 2x2 covariance submatrix: the singular values s are the
    # variances along the principal axes, and U gives their orientation
    U, s, Vh = svd(P)
    orient = math.degrees(math.atan2(U[1, 0], U[0, 0]))   # Ellipse expects degrees
    print(pos)
    print(s)
    axis([0, 100, 0, 100])
    ellipsePlot = Ellipse(xy=pos, width=math.sqrt(s[0]), height=math.sqrt(s[1]),
                          angle=orient, facecolor=face, edgecolor=edge)
    ax = gca()
    ax.add_patch(ellipsePlot)
    show()
    return ellipsePlot

plotEllipse(guess[2:4], cov[2:4, 2:4], 'r', 'b')


Here is our plot:

Error ellipsoid.png