# Difference between revisions of "Eleisha's Segment 23: Bootstrap Estimation of Uncertainty"


## Revision as of 18:14, 6 April 2014

**To Compute:**

1. Generate 100 i.i.d. random draws from the beta distribution $\text{Beta}(2.5, 5)$, for example using MATLAB's betarnd or Python's random.betavariate. Use these to estimate this statistic of the underlying distribution: "value of the 75th percentile point minus value of the 25th percentile point". Now use statistical bootstrap to estimate the distribution of uncertainty of your estimate, for example as a histogram.
After generating 100 i.i.d. random values, my estimated value of the statistic was approximately 0..
After bootstrapping (nboot = 100,000), the mean of my test statistic was approximately 0.2 and the standard deviation was approximately

2. Suppose instead that you can draw any number of desired samples (each 100 draws) from the distribution. How does the histogram of the desired statistic from these samples compare with the bootstrap histogram from problem 1?

3. What is the actual value of the desired statistic for this beta distribution, computed numerically (that is, not by random sampling)? (Hint: I did this in Mathematica in three lines.)

Sample output

<pre>
Estimated Test Statistic: 0.249870329874
Mean of values: 0.249322679305
Standard Deviation of values: 0.0322945809064
Mean of values: 0.231247635839
Standard Deviation of values: 0.0265993072046
Actual Value: 0.232952354264
</pre>
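The three computations above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code that produced the sample output: the helper names (`quantile`, `iqr`, `beta_cdf`, `beta_ppf`), the linear-interpolation percentile rule, the seed, and the reduced `NBOOT` are all choices made here for the example. The "actual value" is obtained by integrating the beta density with Simpson's rule and inverting the CDF by bisection, so no random sampling is involved in that step.

```python
import math
import random
import statistics

A, B = 2.5, 5.0  # shape parameters of Beta(2.5, 5)
N = 100          # draws per sample
NBOOT = 2000     # assumption: smaller than the nboot = 100,000 quoted above, for speed


def quantile(sorted_xs, q):
    """Quantile of an already sorted list, by linear interpolation."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac


def iqr(xs):
    """75th percentile point minus 25th percentile point."""
    s = sorted(xs)
    return quantile(s, 0.75) - quantile(s, 0.25)


def beta_cdf(x, a=A, b=B, steps=2000):
    """Beta CDF by composite Simpson integration of the pdf (needs a, b >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    h = x / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * t ** (a - 1) * (1 - t) ** (b - 1)
    return norm * total * h / 3


def beta_ppf(p, iterations=50):
    """Invert beta_cdf by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if beta_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


rng = random.Random(0)  # fixed seed so the run is repeatable

# Problem 1: one sample of N draws, then bootstrap resamples of that sample.
sample = [rng.betavariate(A, B) for _ in range(N)]
estimate = iqr(sample)
boot = [iqr(rng.choices(sample, k=N)) for _ in range(NBOOT)]

# Problem 2: fresh samples of N draws taken directly from the distribution.
fresh = [iqr([rng.betavariate(A, B) for _ in range(N)]) for _ in range(NBOOT)]

# Problem 3: the statistic of the distribution itself, with no sampling.
exact = beta_ppf(0.75) - beta_ppf(0.25)

print("Estimated statistic:", estimate)
print("Bootstrap mean, sd:", statistics.mean(boot), statistics.stdev(boot))
print("Fresh-sample mean, sd:", statistics.mean(fresh), statistics.stdev(fresh))
print("Actual value:", exact)
```

Passing `boot` and `fresh` to any histogram routine gives the two histograms to compare. The bootstrap values are centered near the single-sample estimate, while the fresh-sample values are centered near the actual value.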

**To Think About**

1. Suppose your desired statistic (for a sample of N i.i.d. data values) was "minimum of the N values". What would the bootstrap estimate of the uncertainty look like in this case? Does this violate the bootstrap theorem? Why or why not?

2. If you knew the distribution, how would you compute the actual distribution for the statistic "minimum of N sampled values", not using random sampling in your computation?

3. For N data points, can you design a statistic so perverse (and different from one suggested in the segment) that the statistical bootstrap fails, even asymptotically as N becomes large?
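As a starting point for the first two questions, here is a small sketch, not a full answer. It assumes the same Beta(2.5, 5) draws as in the computation section; the names `min_cdf` and `uniform_cdf`, the seeds, and the resample counts are hypothetical choices made for this example. It relies on two facts: a bootstrap resample of size N contains the sample minimum with probability 1 - (1 - 1/N)^N, and for i.i.d. draws with CDF F the minimum of N values has CDF 1 - (1 - F(x))^N. The second formula is checked against simulation using the uniform distribution, chosen only because its CDF is trivial to write down.

```python
import random

N = 100       # sample size
NBOOT = 5000  # number of bootstrap resamples (arbitrary choice for this sketch)
rng = random.Random(1)

# Bootstrap the statistic "minimum of the N values".
sample = [rng.betavariate(2.5, 5.0) for _ in range(N)]
sample_min = min(sample)
boot_mins = [min(rng.choices(sample, k=N)) for _ in range(NBOOT)]

# Fraction of resamples whose minimum is exactly the original sample minimum.
frac_at_min = sum(m == sample_min for m in boot_mins) / NBOOT
expected_frac = 1 - (1 - 1 / N) ** N  # about 0.63 for N = 100


def min_cdf(x, cdf, n):
    """CDF of the minimum of n i.i.d. draws whose common CDF is `cdf`."""
    return 1 - (1 - cdf(x)) ** n


def uniform_cdf(x):
    """CDF of Uniform(0, 1), used here only as an easy test case."""
    return min(max(x, 0.0), 1.0)


# Compare the formula with a direct simulation for Uniform(0, 1).
x0 = 0.02
trials = 5000
hits = sum(min(rng.random() for _ in range(N)) <= x0 for _ in range(trials))
monte_carlo = hits / trials

print("Bootstrap mass on the sample minimum:", frac_at_min, "expected", expected_frac)
print("P(min <= 0.02), formula vs simulation:", min_cdf(x0, uniform_cdf, N), monte_carlo)
```

The bootstrap histogram of the minimum is therefore a handful of spikes at the smallest sample values, with most of the mass on the sample minimum itself, whereas the exact distribution given by `min_cdf` is continuous.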

**Back To:** Eleisha Jackson