Segment 22 Sanmit Narvekar

From Computational Statistics Course Wiki
Jump to navigation Jump to search

Segment 22

To Compute

1. In lecture slide 3, suppose (for some perverse reason) we were interested in a quantity Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f = b_3/b_5} instead of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f = b_3b_5} . Calculate a numerical estimate of this new Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f} and its standard error.

The mean is simply the function evaluated at the fitted parameters:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \langle f \rangle = \frac{b_3}{b_5} = \frac{0.6582}{1.4832} = 0.4438}

The variance can be calculated using the following formula:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \text{Var}(f) = \nabla f \Sigma \nabla f^T}

The gradient of f is:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \nabla f = \left[0, 0, \frac{1}{b_5}, 0, - \frac{b_3}{b_5^2} \right]}

By substituting this into the formula above and using the covariance matrix in the slides, we get the variance of f to be:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \text{Var}(f) = \frac{1}{b_5^2}\Sigma_{33} + 2 \frac{1}{b_5} \left(- \frac{b_3}{b_5^2} \right) \Sigma_{35} + \frac{b_3^2}{b_5^4} \Sigma_{55}}

Plugging in the numbers and crunching gives a variance of 0.0145 and a standard deviation of 0.12044.


2. Same set up, but plot a histogram of the distribution of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f} by sampling from its posterior distribution (using Python, MATLAB, or any other platform).

Here is the MATLAB code:


% Mean and cov of b's
bfit = [1.1235 1.5210 0.6582 3.2654 1.4832];

covar = [0.1349 0.2224 0.0068 -0.0309 0.0135;
0.2224 0.6918 0.0052 -0.1598 0.1585; 
0.0068 0.0052 0.0049 0.0016 -0.0094;
-0.0309 -0.1598 0.0016 0.0746 -0.0444;
0.0135 0.1585 -0.0094 -0.0444 0.0948];

% Sample lots of b's
bs = mvnrnd(bfit, covar, 10000);

% Calculate f for each of those b's
fb = bs(:,3) ./ bs(:,5);

% Plot a histogram
hist(fb, 30)

% Report empirical mean and std dev
mean(fb)
std(fb)

This resulted in the following mean and standard deviation:


mu =

    0.4720


stdDev =

    0.1492

And here is the corresponding histogram of the posterior distribution of the parameter:

Sanmit seg22.png

To Think About

1. Lecture slide 2 asserts that a function of normally distributed RVs is not, in general, normal. Consider the product of two independent normals. Is it normal? No! But isn't the product of two normal distribution functions (Gaussians) itself Gaussian? So, what is going on?

2. Can you invent a function of a single normal N(0,1) random variable whose distribution has two separate peaks (maxima)? How about three? How about ten?