# Difference between revisions of "Eleisha's Segment 14: Bayesian Criticism of P-Values"

## Latest revision as of 10:02, 30 April 2014

**To Calculate:**

1. Suppose the stopping rule is "flip exactly 10 times" and the data is that 8 out of 10 flips are heads. With what p-value can you rule out the hypothesis that the coin is fair? Is this statistically significant?

$$\text{p-value} = \sum_{k = 8}^{10} {10 \choose k} (0.5)^k (1 - 0.5)^{10 - k}$$

For a one-sided test the p-value is: 0.0546875

For a two-sided test the p-value is: 0.109375

Here is the Python 2 code that was used to calculate the p-values:

    import scipy.stats


    def get_p_value(k, n, p):
        # Upper-tail probability P(X >= k) for X ~ Binomial(n, p).
        p_value = 0.0
        for i in xrange(k, n + 1):
            p_value = p_value + scipy.stats.binom.pmf(i, n, p)
        return p_value


    print "Stop Rule: Flip 10 times, 8 out of 10 are heads"
    print "P-value One-Sided = " + str(get_p_value(8, 10, 0.5))
    print "P-value Two Sided = " + str(2 * get_p_value(8, 10, 0.5))

The output produced by the code above is:

    Stop Rule: Flip 10 times, 8 out of 10 are heads
    P-value One-Sided = 0.0546875
    P-value Two Sided = 0.109375

This is not statistically significant. With a requirement that p < 0.05, you cannot rule out the hypothesis that the coin is fair, since even the one-sided p-value (0.0546875) is above the threshold.
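The code above is written for Python 2 (it uses the `print` statement and `xrange`). As a cross-check, here is a minimal sketch in Python 3 that recomputes the same tail sum exactly, using only the standard library. The function name `upper_tail` is an illustrative choice, not part of the original solution.

```python
from fractions import Fraction
from math import comb


def upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p), with p given as a Fraction."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# 8 or more heads in 10 flips of a fair coin.
one_sided = upper_tail(8, 10, Fraction(1, 2))
print(one_sided, float(one_sided))  # 7/128 0.0546875
print(float(2 * one_sided))         # 0.109375
```

The exact fraction 7/128 makes the arithmetic visible: (45 + 10 + 1) / 1024 = 56/1024.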

2. Suppose that, as a Bayesian, you see 10 flips of which 8 are heads. Also suppose that your prior for the coin being fair is 0.75. What is the posterior probability that the coin is fair? (Make any other reasonable assumptions about your prior as necessary.)

$$H_A = \text{The coin is fair}$$

$$H_B = \text{The coin is unfair}$$

$$P(H_A) = 0.75$$

$$P(H_B) = 0.25$$

$$P(H_A | data) \propto P(data | H_A)P(H_A) = {10 \choose 8}(0.75)(0.5)^8(0.5)^2 = 0.032959$$

$$P(H_B | data) \propto P(data | H_B)P(H_B) = {10 \choose 8}(0.25)\int_0^1 p^8(1 - p)^2 \, dp = 0.0227273$$

Normalizing by the sum of these two unnormalized values gives:

$$P(H_A | data) = 0.591869$$

$$P(H_B | data) = 0.408131$$

So the posterior probability that the coin is fair is: 0.591869
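The calculation above can be reproduced with a short Python 3 sketch. It assumes, as the integral over p from 0 to 1 implies, that under the unfair hypothesis the heads probability p has a uniform prior on [0, 1]. The function name `posterior_fair` is illustrative and is not the original code.

```python
from math import comb, factorial


def posterior_fair(heads, flips, prior_fair):
    """Posterior probability that the coin is fair, given the flip counts."""
    tails = flips - heads
    n_choose_k = comb(flips, heads)
    # Likelihood under H_A: binomial with p = 0.5.
    like_fair = n_choose_k * 0.5 ** flips
    # Marginal likelihood under H_B, assuming a uniform prior on p:
    # integral of p^heads (1 - p)^tails dp = heads! tails! / (flips + 1)!
    like_unfair = (n_choose_k * factorial(heads) * factorial(tails)
                   / factorial(flips + 1))
    weight_fair = prior_fair * like_fair
    weight_unfair = (1 - prior_fair) * like_unfair
    return weight_fair / (weight_fair + weight_unfair)


print(round(posterior_fair(8, 10, 0.75), 6))  # 0.591869
```

A different prior on p under the unfair hypothesis would change `like_unfair` and therefore the posterior.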

Below is the Mathematica code that was used to calculate the probabilities:

3. For the experiment in the segment, what if the stopping rule was (perversely) "flip until I see five consecutive heads followed immediately by a tail, then count the total number of heads"? What would be the p-value?
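One way to approach this question is by simulation. The sketch below is a Monte Carlo estimate under stated assumptions, not a worked answer. It assumes a fair coin under the null, it treats any run of five or more heads followed by a tail as satisfying the stopping rule, and it takes the observed head count as a placeholder, because the segment's data is not reproduced on this page. The names `heads_until_pattern` and `tail_fractions` are hypothetical.

```python
import random


def heads_until_pattern(rng, p_heads=0.5):
    """Flip until five consecutive heads are followed immediately by a tail.

    Returns the total number of heads seen before stopping.
    """
    heads = 0
    run = 0  # length of the current run of consecutive heads
    while True:
        if rng.random() < p_heads:
            heads += 1
            run += 1
        else:
            # Assumption: a run of five or more heads, then a tail, stops.
            if run >= 5:
                return heads
            run = 0


def tail_fractions(observed_heads, trials=20000, seed=0):
    """Estimate P(heads <= observed) and P(heads >= observed) under the null."""
    rng = random.Random(seed)
    counts = [heads_until_pattern(rng) for _ in range(trials)]
    lower = sum(c <= observed_heads for c in counts) / trials
    upper = sum(c >= observed_heads for c in counts) / trials
    return lower, upper


# observed_heads=8 is a placeholder value, not the segment's actual data.
lower, upper = tail_fractions(observed_heads=8)
print("P(heads <= 8) ~", lower)
print("P(heads >= 8) ~", upper)
```

Which tail counts as "at least as extreme" is itself a choice that depends on the stopping rule, which is the point of the exercise.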

**To Think About:**

1. If biology journals require p<0.05 for results to be published, does this mean that one in twenty biology results are wrong (in the sense that the uninteresting null hypothesis is actually true rather than disproved)? Why might it be worse, or better, than this? (See also the provocative paper by Ioannidis, and this blog in Technology Review (whose main source is this article). Also this news story about ESP research. You can Google for other interesting references.)

No, this does not mean that one in twenty published biology results is false. The threshold p < 0.05 says that when the null hypothesis is true, there is a 5 percent chance of rejecting it anyway. It does not say what fraction of rejections are mistaken; that fraction also depends on how often the hypotheses being tested are true nulls to begin with. In an extreme case, where most tested hypotheses are true nulls, many of the published findings that reject the null will have done so incorrectly, so the situation can be worse than one in twenty results being wrong.
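To make this concrete, here is a small Python sketch of the arithmetic. The inputs are illustrative assumptions, not measured values: `power` is the chance of detecting a real effect, and `frac_real` is the fraction of tested hypotheses where the effect is real. It also assumes every significant result gets published and nothing else does.

```python
def false_positive_fraction(alpha, power, frac_real):
    """Fraction of significant results for which the null is actually true."""
    false_pos = alpha * (1.0 - frac_real)  # true nulls rejected by chance
    true_pos = power * frac_real           # real effects correctly detected
    return false_pos / (false_pos + true_pos)


# Hypothetical scenario: 10% of tested effects are real, 80% power.
print(false_positive_fraction(alpha=0.05, power=0.8, frac_real=0.1))
# Hypothetical scenario: half of tested effects are real, same power.
print(false_positive_fraction(alpha=0.05, power=0.8, frac_real=0.5))
```

Under the first set of assumed inputs about 36 percent of significant results are false positives, well above one in twenty. Under the second the figure is about 6 percent, and it falls below one in twenty once more than half of the tested effects are real, so the answer can be better or worse depending on the inputs.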

**Back To:** Eleisha Jackson