# 16

1. Simulate the following: You have M=50 p-values, none actually causal, so that they are drawn from a uniform distribution. Not knowing this sad fact, you apply the Benjamini-Hochberg prescription with \alpha=0.05 and possibly call some discoveries as true. By repeated simulation, estimate the probability of thus getting N wrongly-called discoveries, for N=0, 1, 2, and 3.

Here is my code for this problem:

import numpy as np
import scipy.stats as sc
import matplotlib.pyplot as plt

alpha = .05
n = 50
sim = 10000
l = []

for k in range(sim):
#creates the random p-values
p = np.arange(0,1,1.0/n)
for i in range(n):
p[i] = (np.random.uniform(0.0, 1.0, 1))
p = sorted(p)

   #gets positive discoveries
x = np.arange(1,n+1, 1)
i = 0
while p[i] < (i+1)*alpha/(n+1):
i += 1

#data output
if i == 0:
fd[0] += 1
if i == 1:
fd[1] += 1
if i == 2:
fd[2] += 1
if i == 3:
fd[3] += 1


print fd

for 10^5 runs, the data is as follows:

N = 0 --> 95258 --> P(N = 0) = 95.258%
N = 1 --> 4412 --> P(N = 1) = 4.412%
N = 2 --> 303 --> P(N = 2) = .303%
N = 3 --> 32 --> P(N =3) = .032%


2. Does the distribution that you found in problem 1 depend on M? On \alpha? Derive its form analytically for the usual case of \alpha \ll 1?