02-24-14 -- Group 1 -- Group Quiz

From Computational Statistics Course Wiki


Problem 1

The distribution $p_X(x)$:

Probability distribution for problem 1.

Problem 2

$$\mu = \mathbb{E}(x) = \int_{-\infty}^{\infty} x \, p(x) \, dx = \int_{0}^{2} x \left(1 - \frac{x}{2}\right) dx = \frac{2}{3}$$

$$\mathbb{E}(x^2) = \int_{-\infty}^{\infty} x^2 \, p(x) \, dx = \int_{0}^{2} x^2 \left(1 - \frac{x}{2}\right) dx = \frac{2}{3}$$

$$\mathrm{Var}(x) = \mathbb{E}(x^2) - (\mathbb{E}(x))^2 = \frac{2}{3} - \left(\frac{2}{3}\right)^2 = \frac{6}{9} - \frac{4}{9} = \frac{2}{9}$$

$$\mathrm{Std}(x) = \sqrt{\mathrm{Var}(x)} = \sqrt{\frac{2}{9}} \approx 0.471$$
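
The integrals above can be checked numerically. The following is a minimal sketch, assuming the density is p(x) = 1 - x/2 on (0, 2) as in the integrands above; the variable names are illustrative, and it relies on MATLAB's built-in `integral`.

```matlab
% Triangular density from Problem 1: p(x) = 1 - x/2 on (0, 2)
p = @(x) 1 - x/2;

% Moments by numerical quadrature over the support
mu  = integral(@(x) x    .* p(x), 0, 2);   % E(x)
ex2 = integral(@(x) x.^2 .* p(x), 0, 2);   % E(x^2)

v = ex2 - mu^2;    % Var(x)
s = sqrt(v);       % Std(x)

fprintf('mean = %.4f, E(x^2) = %.4f, var = %.4f, std = %.4f\n', mu, ex2, v, s);
```

The printed values should agree with 2/3, 2/3, 2/9 and roughly 0.471 to quadrature accuracy.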

Problem 3

The CDF is the integral of the PDF.

$$F(x) = \begin{cases} 0, & x \leq 0 \\ x - \frac{x^2}{4}, & 0 < x < 2 \\ 1, & x \geq 2 \end{cases}$$
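
For reference, here is a short worked derivation of the middle piece of the CDF and of its inverse, which is the formula used to draw deviates in Problem 4. The symbols F and u are notation introduced here for illustration, not taken from the quiz.

```latex
F(x) = \int_{0}^{x} \left(1 - \frac{t}{2}\right) dt = x - \frac{x^2}{4}, \qquad 0 < x < 2

u = x - \frac{x^2}{4}
\;\Longrightarrow\; x^2 - 4x + 4u = 0
\;\Longrightarrow\; x = 2 \pm 2\sqrt{1 - u}

x = F^{-1}(u) = 2 - 2\sqrt{1 - u}, \qquad 0 \le u \le 1
```

Only the minus root lies in [0, 2], so it is the one that inverts the CDF on the support.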

Problem 4

Code


% Inverse CDF of p(x): maps a uniform(0,1) draw p to a random deviate
% distributed according to p(x)
invCDF = @(p) (2-2*sqrt(1-p));

% Visualization
hist(invCDF(rand(100000, 1)))


Visualization

Here is a histogram of deviates drawn using the function above. As expected, it resembles the PDF of x, falling off linearly from x = 0 to x = 2.

[Figure: Sanmit Quiz InvCDF.png]
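
Beyond the visual check, the sampler can be sanity-checked numerically. This is a sketch under the assumption that the CDF on (0, 2) is x - x^2/4 (Problem 3); the names `cdf` and `roundTripErr` are illustrative.

```matlab
% CDF from Problem 3 on (0, 2), and the sampler from Problem 4
cdf    = @(x) x - x.^2/4;
invCDF = @(p) 2 - 2*sqrt(1 - p);

% Round trip: cdf(invCDF(u)) should return u for u in [0, 1]
u = linspace(0, 1, 101)';
roundTripErr = max(abs(cdf(invCDF(u)) - u));

% Sample moments should approach mean 2/3 and variance 2/9 (Problem 2)
x = invCDF(rand(100000, 1));
fprintf('round-trip error = %.2e\n', roundTripErr);
fprintf('sample mean = %.3f, sample var = %.3f\n', mean(x), var(x));
```

A round-trip error near machine precision and sample moments close to 2/3 and 2/9 suggest the inverse CDF is consistent with the earlier problems.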

Problem 5

By the central limit theorem, the sum S of N draws is approximately normal with mean $\mu = \frac{2}{3} N$ and variance $\sigma^2 = \frac{2}{9} N$:

$$S \sim \mathcal{N}\left(\mu, \sigma^2\right), \qquad p_S(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2 \sigma^2}\right) = \frac{1}{\sqrt{\frac{4 \pi}{9} N}} \exp\left(-\frac{\left(x - \frac{2}{3} N\right)^2}{\frac{4}{9} N}\right)$$
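
The mean and variance of the sum can be checked by simulation. The sketch below assumes N = 28 (the value used in Problem 6) and an arbitrary number of trials; it reuses the inverse CDF from Problem 4.

```matlab
invCDF = @(p) 2 - 2*sqrt(1 - p);   % sampler from Problem 4

N      = 28;      % draws per sum (the value used in Problem 6)
trials = 20000;   % number of simulated sums (arbitrary choice)

% Each column holds one dataset of N draws; sum down the columns
S = sum(invCDF(rand(N, trials)), 1);

fprintf('simulated mean = %.3f (predicted: %.3f)\n', mean(S), 2*N/3);
fprintf('simulated var  = %.3f (predicted: %.3f)\n', var(S),  2*N/9);
```

A histogram of `S` can additionally be compared against the normal density to judge how good the approximation is at this N.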

Problem 6

Code

The code below computes p-values for two datasets, one drawn from uniform(0,5) and the other from p(x) of Problem 1. In both cases the null hypothesis is that the data were drawn from p(x), and the test statistic is the sum of the n data points.

% Inverse CDF of p(x): maps a uniform(0,1) draw to a deviate from p(x)
invCDF = @(p) (2-2*sqrt(1-p));

n = 28;
mu = (2/3) * n;            % mean of the sum of n draws under the null
sigma = sqrt((2/9) * n);   % normcdf takes the standard deviation, not the variance

% Test statistics: sums of n deviates from uniform(0,5) and from p(x)
randDraws = sum(5 * rand(n, 1));
pDraws = sum(invCDF(rand(n,1)));

% Two-sided p-value for the dataset of uniform(0,5) deviates
if randDraws < mu
    randPval = 2 * normcdf(randDraws, mu, sigma);
else
    randPval = 2 * (1 - normcdf(randDraws, mu, sigma));
end

% Two-sided p-value for the dataset of deviates drawn from p(x)
if pDraws < mu
    pPval = 2 * normcdf(pDraws, mu, sigma);
else
    pPval = 2 * (1 - normcdf(pDraws, mu, sigma));
end

randPval
pPval

Sample Result

(From a run with sigma set to the variance (2/9) * n; with the standard deviation, both p-values for the same draws are smaller.)


randPval =

   9.5479e-15


pPval =

    0.6586

Across repeated runs, the null hypothesis is consistently rejected at the 5% significance level (and at much smaller levels) for the dataset drawn from uniform(0,5). For the dataset drawn from p(x), the null hypothesis is typically not rejected.
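
The claim about repeated runs can be illustrated by estimating rejection rates. This is a sketch, not the original quiz code: the number of runs is arbitrary, and the normal CDF is written with `erfc` (the helper `Phi` is introduced here) so that it does not need the Statistics Toolbox.

```matlab
invCDF = @(p) 2 - 2*sqrt(1 - p);         % sampler from Problem 4
Phi    = @(z) 0.5 * erfc(-z / sqrt(2));  % standard normal CDF

n     = 28;
mu    = (2/3) * n;          % mean of the sum under the null
sigma = sqrt((2/9) * n);    % standard deviation of the sum under the null
runs  = 5000;               % number of repeated experiments (arbitrary)
alpha = 0.05;

% Test statistics: one sum per run for each data source
sUnif = sum(5 * rand(n, runs), 1);
sP    = sum(invCDF(rand(n, runs)), 1);

% Two-sided p-value under the normal approximation: 2 * Phi(-|z|)
twoSided = @(s) 2 * Phi(-abs((s - mu) / sigma));

rejUnif = mean(twoSided(sUnif) < alpha);   % fraction of runs rejecting
rejP    = mean(twoSided(sP)    < alpha);
fprintf('rejection rate: uniform(0,5) = %.3f, p(x) = %.3f\n', rejUnif, rejP);
```

The rate for uniform(0,5) should be essentially 1, while the rate for data drawn from p(x) should sit near alpha, which is the expected false-positive rate of the test.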

Problem 7

Solution
Our approach is similar to Problem 6. We perform a p-value test for each of the known hypotheses and apply a multiple-hypothesis correction with cutoff $\alpha' = \alpha/10$. Hypotheses that fail the p-value test are ruled out; we cannot say anything conclusive about the remaining hypotheses.
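
A minimal sketch of this procedure follows. The ten candidate hypotheses are hypothetical (p(x) plus uniform(0, b) for nine widths b), since the quiz's actual list is not reproduced here; the dataset is a stand-in drawn from p(x), and `Phi` is a helper introduced for the normal CDF.

```matlab
invCDF = @(p) 2 - 2*sqrt(1 - p);         % sampler from Problem 4
Phi    = @(z) 0.5 * erfc(-z / sqrt(2));  % standard normal CDF

n = 28;
x = invCDF(rand(n, 1));   % stand-in dataset (here drawn from p(x))
s = sum(x);               % test statistic, as in Problem 6

% Hypothetical candidates: p(x) plus uniform(0, b) for nine widths b
b     = (0.5:0.5:4.5)';
means = [2/3; b/2];       % per-draw mean under each hypothesis
vars  = [2/9; b.^2/12];   % per-draw variance under each hypothesis

alpha      = 0.05;
alphaPrime = alpha / numel(means);   % alpha/10 for ten hypotheses

% Two-sided p-value of s under each hypothesis (normal approximation)
z     = (s - n*means) ./ sqrt(n*vars);
pvals = 2 * Phi(-abs(z));

ruledOut = pvals < alphaPrime;   % true where the hypothesis is rejected
disp([means, pvals, ruledOut]);
```

Rows with a 1 in the last column are ruled out at the corrected level; the remaining rows are merely not rejected.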

Problem 8

For each hypothesis $H_j$ we compute a quantity proportional to its probability given the data, namely the likelihood of the data under $H_j$ times the prior of $H_j$:

  $$P_j = \Pr(\mathrm{Data} \mid H_j) \cdot P(H_j) = \prod_{i=1}^{28} p_{X}^{(j)}(x_i) \cdot P(H_j)$$

  $$\hat{j} = \arg\max_{j} P_j$$

We pick the hypothesis $H_{\hat{j}}$ that maximizes the probability given the data.
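
The selection rule can be sketched in code. The three candidate densities and the equal priors below are hypothetical stand-ins, and the dataset is simulated from p(x); logarithms are used so that the product of 28 densities does not underflow.

```matlab
invCDF = @(p) 2 - 2*sqrt(1 - p);   % sampler from Problem 4
x = invCDF(rand(28, 1));           % stand-in dataset of 28 points

% Hypothetical candidate densities; each returns 0 off its support
pdfs = { @(x) (1 - x/2) .* (x > 0 & x < 2), ...   % p(x) from Problem 1
         @(x) 0.5 * (x > 0 & x < 2), ...          % uniform(0, 2)
         @(x) 0.2 * (x > 0 & x < 5) };            % uniform(0, 5)
prior = [1/3, 1/3, 1/3];                          % assumed equal priors P(H_j)

% log P_j = sum_i log(density_j(x_i)) + log P(H_j)
logP = zeros(1, numel(pdfs));
for j = 1:numel(pdfs)
    logP(j) = sum(log(pdfs{j}(x))) + log(prior(j));
end

[~, best] = max(logP);   % index of the hypothesis with the largest P_j
fprintf('log P_j = %s, selected hypothesis = %d\n', mat2str(logP, 4), best);
```

Because the logarithm is monotonic, maximizing log P_j selects the same hypothesis as maximizing P_j.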

Problem 9

Solution
We would multiply the characteristic functions and take the inverse Fourier transform of the product.
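
As a numerical illustration, assuming the question concerns the exact distribution of the sum of N independent draws from p(x), the characteristic function can be approximated by the FFT of the discretized density. The grid spacing `dx`, the FFT length `M`, and N = 28 are choices made here, not values from the quiz.

```matlab
N  = 28;            % number of draws in the sum (value from Problem 6)
dx = 0.01;          % grid spacing (assumed; smaller is more accurate)
M  = 2^13;          % FFT length; M*dx = 81.92 exceeds the support [0, 2N]

% Probability masses of one draw on midpoints of cells covering (0, 2)
k    = (0:M-1)';
xMid = (k + 0.5) * dx;
mass = (1 - xMid/2) .* (xMid < 2) * dx;

% Discrete characteristic function, raised to the N-th power, then inverted
phi   = fft(mass);
massS = real(ifft(phi .^ N));
massS = max(massS, 0);          % clip tiny negative round-off

% Entry k of massS sits at the sum of N midpoints: (k + N/2) * dx
s    = (k + N/2) * dx;
pdfS = massS / dx;              % density of the sum on the grid

meanS = sum(s .* massS);
varS  = sum(s.^2 .* massS) - meanS^2;
fprintf('total mass = %.4f, mean = %.3f, var = %.3f\n', sum(massS), meanS, varS);
```

Raising the transform to the N-th power corresponds to multiplying N identical characteristic functions; the resulting mean and variance should be close to (2/3)N and (2/9)N, and `pdfS` can be plotted against the normal approximation from Problem 5.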