Segment 10 Sanmit Narvekar

From Computational Statistics Course Wiki

Segment 10

To Calculate

1. Take 12 random values, each uniform between 0 and 1. Add them up and subtract 6. Prove that the result is close to a random value drawn from the Normal distribution with mean zero and standard deviation 1.

Let $X_i \sim U(0,1)$ be the random variable representing the value drawn in draw $i$. We want to show that $\left(\sum_{i=1}^{12} X_i\right) - 6$ is close to a random value drawn from $N(0,1)$.

First we calculate the mean and variance of each X_i, which will be useful later when trying to determine the mean and variance of the sum - 6:

$$\mu_i = E[X_i] = \int_{-\infty}^{\infty} x \, p(x) \, dx = \int_0^1 x \, dx = \frac{x^2}{2} \Big|_0^1 = \frac{1}{2}$$

$$E[X_i^2] = \int_{-\infty}^{\infty} x^2 \, p(x) \, dx = \int_0^1 x^2 \, dx = \frac{x^3}{3} \Big|_0^1 = \frac{1}{3}$$

$$\sigma_i^2 = E[X_i^2] - (E[X_i])^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$


And now we use these to calculate the desired quantities (making heavy use of linearity of expectation, and properties of variance for independent random variables):


$$\mu = E\left[\left(\sum_{i=1}^{12} X_i\right) - 6\right] = \left(\sum_{i=1}^{12} E[X_i]\right) - E[6] = \left(\sum_{i=1}^{12} \frac{1}{2}\right) - 6 = 6 - 6 = 0$$

$$\sigma^2 = \mathrm{Var}\left(\left(\sum_{i=1}^{12} X_i\right) - 6\right) = \mathrm{Var}\left(\sum_{i=1}^{12} X_i\right) = \sum_{i=1}^{12} \mathrm{Var}(X_i) = \sum_{i=1}^{12} \frac{1}{12} = 1$$

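Note that Var(X + Y) = Var(X) + Var(Y) holds when X and Y are independent (more generally, whenever they are uncorrelated), which they are in this example. Subtracting the constant 6 shifts the mean but leaves the variance unchanged.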

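Thus, the result has the same mean (0) and variance (1) as a standard Normal. Matching two moments does not by itself establish the shape of the distribution; that comes from the Central Limit Theorem: the result is a sum of 12 independent, identically distributed random variables with finite variance, so its distribution is approximately Normal. The approximation is not exact, since the result can never fall outside [-6, 6].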
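As a quick numerical sanity check of the result above, here is a minimal simulation sketch using only the Python standard library. It draws many such sums and reports the sample mean, the sample variance, and the fraction of samples within one standard deviation of zero, which for N(0,1) would be 0, 1, and about 0.683. The function names, the seed, and the sample size are illustrative choices, not part of the original problem.

```python
import random
import statistics


def sum_of_twelve(rng):
    """Return (sum of 12 independent U(0,1) draws) - 6."""
    return sum(rng.random() for _ in range(12)) - 6.0


def simulate(n_samples=100_000, seed=0):
    """Draw n_samples values and return (mean, variance, fraction with |x| <= 1)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    samples = [sum_of_twelve(rng) for _ in range(n_samples)]
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples)
    within_one_sigma = sum(abs(s) <= 1.0 for s in samples) / n_samples
    return mean, var, within_one_sigma


if __name__ == "__main__":
    mean, var, frac = simulate()
    print(f"sample mean        = {mean:.4f}   (N(0,1): 0)")
    print(f"sample variance    = {var:.4f}   (N(0,1): 1)")
    print(f"fraction |x| <= 1  = {frac:.4f}   (N(0,1): about 0.683)")
```

Agreement of these summaries is evidence of closeness, not a proof; the proof is the moment calculation together with the Central Limit Theorem.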


2. Invent a family of functions, each different, that look like those in Slide 3: they all have value 1 at x = 0; they all have zero derivative at x = 0; and they generally (not necessarily monotonically) decrease to zero at large x. Now multiply 10 of them together and graph the result near the origin (i.e., reproduce what Slide 3 was sketching).
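One possible way to approach this problem is sketched below. The family f_k(x) = cos(a_k x) / (1 + b_k x^2) and the particular values of a_k and b_k are assumptions chosen for illustration (Slide 3 is not reproduced here): each member equals 1 at x = 0, is even in x and therefore has zero derivative there, and decays to zero non-monotonically. The sketch multiplies 10 of them and tabulates the product next to the Gaussian exp(-c x^2) that matches it to second order at the origin; the tabulated values can be passed to any plotting tool to produce the graph.

```python
import math


def make_family(n=10):
    """Build n distinct functions f_k(x) = cos(a_k x) / (1 + b_k x^2).

    The parameters a_k and b_k are arbitrary illustrative choices.
    """
    params = [(0.5 + 0.1 * k, 0.2 + 0.05 * k) for k in range(n)]
    # Default arguments bind a and b per function (avoids late binding).
    funcs = [lambda x, a=a, b=b: math.cos(a * x) / (1.0 + b * x * x)
             for a, b in params]
    return params, funcs


def product(funcs, x):
    """Multiply all the functions together at the point x."""
    result = 1.0
    for f in funcs:
        result *= f(x)
    return result


def gaussian_approx(params, x):
    """Gaussian exp(-c x^2) matching the product to second order at x = 0.

    Each f_k(x) = 1 - (a_k^2 / 2 + b_k) x^2 + O(x^4), so c is the sum of
    those second-order coefficients.
    """
    c = sum(a * a / 2.0 + b for a, b in params)
    return math.exp(-c * x * x)


if __name__ == "__main__":
    params, funcs = make_family()
    for i in range(11):
        x = 0.1 * i
        print(f"x = {x:.1f}  product = {product(funcs, x):.5f}  "
              f"gaussian = {gaussian_approx(params, x):.5f}")
```

Near the origin the two columns should track each other closely, which is the behavior the problem asks to reproduce.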


3. For what value(s) of $\nu$ does the Student distribution (Segment 8, Slide 4) have a convergent 1st and 2nd moment, but divergent 3rd and higher moments?
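A sketch of the tail argument, assuming the slide uses the standard form of the Student distribution (any location and scale parameters do not change the tail exponent), and reading "convergent" as absolute convergence of the moment integral:

```latex
p(t) \propto \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}
\sim |t|^{-(\nu+1)} \quad \text{as } |t| \to \infty

\int^{\infty} |t|^{k} \, |t|^{-(\nu+1)} \, dt < \infty
\iff k - \nu - 1 < -1
\iff k < \nu

\text{2nd moment converges} \iff \nu > 2, \qquad
\text{3rd moment diverges} \iff \nu \le 3
\quad \Longrightarrow \quad 2 < \nu \le 3
```

Under these assumptions, the 1st moment also converges (it only needs $\nu > 1$), and every moment above the 3rd diverges as well.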


To Think About

1. A distribution with moments as in problem 3 above has a well-defined mean and variance. Does the CLT hold for the sum of RVs from such a distribution? If not, what goes wrong in the proof? Is the mean of the sum equal to the sum of the individual means? What about the variance of the sum? What, qualitatively, does the distribution of the sum of a bunch of them look like?


2. Give an explanation of Bessel's correction in the last expression on slide 5. If, as we see, the MAP calculation gives the factor 1/N, why would one ever want to use 1/(N-1) instead? (There are various wiki and stackoverflow pages on this. See if they make sense to you!)


Just for fun

A fun problem that ties in to 'To Calculate' 1 above and problem 6 from the Probability Blitz:


1. What is the expected number of Uniform[0,1] draws you need to add up before the sum exceeds 1? Prove your answer analytically and confirm it by simulation.
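A sketch of one route to the answer: let N be the number of draws needed. The probability that the first n draws sum to at most 1 is the volume of the unit simplex, 1/n!, so P(N > n) = 1/n! and E[N] = \sum_{n \ge 0} P(N > n) = \sum_{n \ge 0} 1/n! = e, about 2.718. The simulation below is a minimal check of that value using only the Python standard library; the function names, the seed, and the number of trials are illustrative choices.

```python
import math
import random


def draws_until_exceeds_one(rng):
    """Count U(0,1) draws added until the running sum first exceeds 1."""
    total = 0.0
    count = 0
    while total <= 1.0:
        total += rng.random()
        count += 1
    return count


def estimate_expected_draws(n_trials=200_000, seed=0):
    """Monte Carlo estimate of the expected number of draws."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    total = sum(draws_until_exceeds_one(rng) for _ in range(n_trials))
    return total / n_trials


if __name__ == "__main__":
    estimate = estimate_expected_draws()
    print(f"simulated estimate = {estimate:.4f}")
    print(f"e                  = {math.e:.4f}")
```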

