Difference between revisions of "Eleisha's Segment 12: P-Value Tests"

From Computational Statistics Course Wiki

Revision as of 14:22, 20 April 2014

To Calculate:

1. What is the critical region for a 5% two-sided test if, under the null hypothesis, the test statistic is distributed as $\text{Student}(0,\sigma,4)$? That is, what values of the test statistic disprove the null hypothesis with p < 0.05? (OK to use Python, MATLAB, or Mathematica.)

Let t = the value of the test statistic. If the test statistic is distributed as $\text{Student}(0,\sigma,4)$, then for a two-sided test the critical region is $|t| > 2.77645 \sigma$. This can be calculated by taking the inverse of the CDF of the probability distribution and evaluating it at (1 - 0.05/2).
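The inverse-CDF step can be sketched in Python using only the standard library. This is a minimal sketch, not the original Mathematica code: it assumes that Student(0, σ, 4) denotes a Student-t distribution with 4 degrees of freedom, location 0 and scale σ, so the cutoff for σ = 1 simply scales by σ. The function names `student4_cdf` and `student4_quantile` are hypothetical, and the closed-form CDF used is the standard one for 4 degrees of freedom.

```python
import math


def student4_cdf(x):
    """CDF of the standard Student-t distribution with 4 degrees of freedom."""
    # s = x / sqrt(1 + x^2/4) lies in (-2, 2); the CDF is increasing in s.
    s = x / math.sqrt(1.0 + x * x / 4.0)
    return 0.5 + 0.375 * s * (1.0 - s * s / 12.0)


def student4_quantile(q, lo=-1000.0, hi=1000.0, tol=1e-12):
    """Invert the CDF by bisection, which works because the CDF is monotonic."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if student4_cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha = 0.05
# Two-sided test: put alpha/2 in each tail, so evaluate at 1 - alpha/2.
t_crit = student4_quantile(1 - alpha / 2)
print(f"reject the null when |t| > {t_crit:.5f} * sigma")
```

Because the distribution is symmetric about zero, the single upper cutoff also gives the lower one by changing its sign.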

2. For an exponentially distributed test statistic with mean $\mu$ (under the null hypothesis), when is the null hypothesis disproved with p < 0.01 for a one-sided test? For a two-sided test?

Let t = the value of the test statistic

The pdf for an exponentially distributed test statistic with parameter $\lambda$ is:

$p(x) = \lambda e^{-\lambda x}$

Since the mean of p(x) is $\frac{1}{\lambda}$, we take $\lambda = \frac{1}{\mu}$.

We can solve for the critical region in a similar manner to question one by determining the inverse of the CDF of p(x) and evaluating it at (1 - 0.01) for a one-sided test, and at both 0.01/2 and (1 - 0.01/2) for a two-sided test.

For a one-sided test the null hypothesis is disproved with p < 0.01 when $t > 4.60517 \mu$.

For a two-sided test the null hypothesis is disproved with p < 0.01 when $t < 0.00501254 \mu$ or $t > 5.29832 \mu$. Because an exponentially distributed statistic is never negative, the two tails are not symmetric: the lower cutoff is the 0.005 quantile, $-\mu \ln(0.995)$, and the upper cutoff is the 0.995 quantile, $-\mu \ln(0.005)$.

Below is the Mathematica code that I used to solve for the critical regions:

[Image: Eleisha math 12.png]
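For comparison, the same quantile calculation for the exponential case can be sketched in Python. This is a minimal sketch and not a transcription of the Mathematica image. It assumes the CDF F(x) = 1 - exp(-x/μ), which inverts in closed form, and it sets μ = 1 so the results are multiples of μ. The function name `exponential_quantile` is hypothetical. A two-sided test at the 1% level puts 0.5% in each tail, so it has both a lower and an upper cutoff.

```python
import math


def exponential_quantile(q, mu=1.0):
    """Inverse CDF of an exponential distribution with mean mu."""
    # F(x) = 1 - exp(-x / mu)  =>  x = -mu * ln(1 - q)
    return -mu * math.log1p(-q)


alpha = 0.01

# One-sided test: all of alpha sits in the upper tail.
one_sided_upper = exponential_quantile(1 - alpha)

# Two-sided test: alpha/2 in each tail.
two_sided_lower = exponential_quantile(alpha / 2)
two_sided_upper = exponential_quantile(1 - alpha / 2)

print(f"one-sided: reject when t > {one_sided_upper:.5f} * mu")
print(f"two-sided: reject when t < {two_sided_lower:.8f} * mu "
      f"or t > {two_sided_upper:.5f} * mu")
```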


To Think About:

1. P-value tests require an initial choice of a test statistic. What goes wrong if you choose a poor test statistic? What would make it poor?

A poor test statistic would be one where the distribution of the

2. If the null hypothesis is that a coin is fair, and you record the results of N flips, what is a good test statistic? Are there any other possible test statistics?

A good test statistic is the number of heads in the N flips, which under the null is binomially distributed with parameters N and p = 0.5.
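To make the binomial statistic concrete, here is a minimal Python sketch of an exact two-sided p-value for the number of heads. The function name `coin_two_sided_p_value` and the example of 8 heads in 10 flips are illustrative assumptions, not part of the original answer. "Two-sided" is taken here to mean all outcomes at least as far from N/2 as the observed count, which is one reasonable convention among several.

```python
from math import comb


def coin_two_sided_p_value(heads, n_flips):
    """Exact two-sided p-value for a fair coin, using the head count as the statistic."""
    center = n_flips / 2
    observed_distance = abs(heads - center)
    # Count the equally likely flip sequences whose head count is at least
    # as far from n_flips / 2 as the observed one.
    extreme_sequences = sum(
        comb(n_flips, k)
        for k in range(n_flips + 1)
        if abs(k - center) >= observed_distance
    )
    return extreme_sequences / 2 ** n_flips


# Illustrative data: 8 heads in 10 flips.
p_example = coin_two_sided_p_value(8, 10)
print(p_example)  # 112 / 1024 = 0.109375
```

With this convention, 8 heads in 10 flips would not disprove the fair-coin null at the 5% level.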

3. Why is it so hard for a Bayesian to do something as simple as, given some data, disproving a null hypothesis? Can't she just compute a Bayes odds ratio, P(null hypothesis is true)/P(null hypothesis is false) and derive a probability that the null hypothesis is true?

Even with the Bayes odds ratio, one cannot definitively say that the null hypothesis is false. If the Bayesian finds EME (exhaustive and mutually exclusive) hypotheses associated with the data and calculates the ratios between them, this still is not a rejection of the null hypothesis.
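As a reminder of what the odds ratio involves, Bayes' theorem applied to the null and one specific alternative can be written as below. The symbols $H_0$ (null), $H_1$ (a stated alternative), and $D$ (data) are notation introduced here for illustration and do not appear in the question.

```latex
\frac{P(H_0 \mid D)}{P(H_1 \mid D)}
  = \frac{P(D \mid H_0)}{P(D \mid H_1)} \cdot \frac{P(H_0)}{P(H_1)}
```

Every term on the right requires a concrete alternative whose likelihood and prior can be written down. The ratio therefore ranks the null against that stated alternative. Turning it into a probability that the null is true would require the listed hypotheses to be exhaustive and mutually exclusive.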

Back To: Eleisha Jackson