# Segment 23 Sanmit Narvekar

## Segment 23

#### To Calculate

1. Generate 100 i.i.d. random draws from the beta distribution $\text{Beta}(2.5, 5)$, for example using MATLAB's betarnd or Python's random.betavariate. Use these to estimate this statistic of the underlying distribution: "value of the 75th percentile point minus value of the 25th percentile point". Now use statistical bootstrap to estimate the distribution of uncertainty of your estimate, for example as a histogram.

Here is the MATLAB code:

%% Question 1
nData = 100;
data = betarnd(2.5,5,[nData,1]);

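% Calculate statistic: with 100 sorted points, the 75th and 25th order
% statistics serve as the 75th and 25th percentile estimates
sortedData = sort(data);
statistic = sortedData(75) - sortedData(25)

% Use bootstrap to estimate uncertainty
nBoots = 100000;
vals = zeros(nBoots, 1);
for iter = 1:nBoots
    % Resample the data with replacement, keeping the sample size at nData
    sample = randsample(data, nData, true);
    sortedSample = sort(sample);
    vals(iter) = sortedSample(75) - sortedSample(25);
end

% Calculate uncertainty: histogram and standard deviation of the
% bootstrapped statistic
hist(vals, 100)
stdDev = std(vals)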


The sample estimate of the statistic was 0.2482, with a standard deviation of 0.0288 estimated by bootstrapping. The histogram of the uncertainty is shown below:

2. Suppose instead that you can draw any number of desired samples (each 100 draws) from the distribution. How does the histogram of the desired statistic from these samples compare with the bootstrap histogram from problem 1?

I repeated the calculation above with 1,000,000 and 1,000 bootstrap resamples instead of 100,000. The resulting histograms are shown below. Using more bootstrap resamples results in a smoother histogram, but doesn't significantly change the estimated uncertainty.
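Problem 2 can also be read literally: draw many independent samples of 100 points directly from $\text{Beta}(2.5, 5)$ and compute the statistic on each one. The following is a minimal sketch of that reading, not the calculation described above. The seed, the number of samples (`nSamples`), and the variable names are assumptions chosen for illustration, and `betarnd` is assumed to be available as in problem 1.

```matlab
%% Question 2 (sketch): statistic computed on fresh samples
rng(0);              % arbitrary fixed seed so the run is repeatable (assumption)
nData = 100;         % draws per sample, as in problem 1
nSamples = 10000;    % number of independent samples (assumption)

% Each column is one independent sample of nData draws from Beta(2.5, 5)
samples = betarnd(2.5, 5, [nData, nSamples]);

% Sort each column and use the same order statistics as in problem 1
sortedSamples = sort(samples, 1);
freshVals = sortedSamples(75, :) - sortedSamples(25, :);

% Histogram and summary of the statistic across fresh samples
hist(freshVals, 100)
freshMean = mean(freshVals)
freshStd = std(freshVals)
```

In general, one would expect this histogram to be centered near the true value of the statistic, while the bootstrap histogram from problem 1 is centered near the estimate from that one particular sample. The widths of the two are typically similar, which is what makes the bootstrap useful as an uncertainty estimate.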

3. What is the actual value of the desired statistic for this beta distribution, computed numerically (that is, not by random sampling)? (Hint: I did this in Mathematica in three lines.)

This is very straightforward to compute with the inverse beta CDF (I did it in 2 lines in MATLAB, ha):

%% Question 3
percentiles = betainv([0.75, 0.25], 2.5, 5);
trueStatistic = percentiles(1)-percentiles(2)


The actual value of the desired statistic is 0.2330.
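As a quick consistency check, the sample estimate from problem 1 can be compared with this true value in units of the bootstrap standard deviation. The short sketch below uses only the three numbers reported on this page; the variable names `estimate`, `bootStd`, `difference`, and `zScore` are introduced here for illustration.

```matlab
%% Compare the sample estimate with the true value (sketch)
estimate      = 0.2482;   % sample estimate reported in problem 1
bootStd       = 0.0288;   % bootstrap standard deviation reported in problem 1
trueStatistic = 0.2330;   % value reported in problem 3

difference = estimate - trueStatistic   % 0.0152
zScore = difference / bootStd           % about 0.53
```

The estimate lies roughly half a bootstrap standard deviation above the true value, so the two are consistent given the estimated uncertainty.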