From Computational Statistics (CSE383M and CS395T)
Jump to navigation Jump to search

Skilled problem

problem 1

Simulate the following: You have M=50 p-values, none actually causal, so that they are drawn from a uniform distribution. Not knowing this sad fact, you apply the Benjamini-Hochberg prescription with <math>\alpha=0.05</math> and possibly call some discoveries as true. By repeated simulation, estimate the probability of thus getting N wrongly-called discoveries, for N=0, 1, 2, and 3.

Since it's uniform distribution for all 50 p-values, all the p-value discovered as wrong-called discoveries.

Here's the code:

import numpy as np
from collections import Counter

num = 1000000

result = [sum([1 for i,e in enumerate(sorted(case)) if e < (i+1)*0.05/50]) for case in np.random.rand(num,50)]

c = Counter(result)

print c[0]*1.0/num
print c[1]*1.0/num
print c[2]*1.0/num
print c[3]*1.0/num

The probability for N=0,1,2,3 is 0.950022, 0.046302, 0.003241, 0.000255.

problem 2

Does the distribution that you found in problem 1 depend on M? On <math>\alpha</math>? Derive its form analytically for the usual case of <math>\alpha \ll 1</math>?

The distribution depends on M, and <math> \alpha </math>

Since it's uniform, for N = 0, all uniform variable larger than <math> \frac\alpha{M} </math>, the probability is

<math> P_0 = \binom{M}{0} \cdot (1-\frac\alpha{M})^M </math>

Similarly, for N = 1,2,3,

<math> P_1 = \binom{M}{1} \cdot \frac{\alpha(1-\frac{2\alpha}{M})^{M-1}}{M}</math>

<math> P_2 = \binom{M}{2} \cdot \frac{2\alpha^2(1-\frac{3\alpha}{M})^{M-2}}{M^2}</math>

<math> P_3 = \binom{M}{3} \cdot \frac{6\alpha^3(1-\frac{4\alpha}{M})^{M-3}}{M^3} </math>

<math> P_n = \binom{M}{n} \cdot \frac{n! \cdot \alpha^n(1-\frac{(n+1)\alpha}{M})^{M-n}}{M^n} </math>

The analytical results are:

0.951205628197, 0.045327996894, 0.00212096588611, 9.74083591421e-5

Code for the above problems: code

Class activity

In team with Noah, Rcardenas,Sean Trettel.

The code is in Noah