# 16

Jump to navigation
Jump to search

1. Simulate the following: You have M=50 p-values, none actually causal, so that they are drawn from a uniform distribution. Not knowing this sad fact, you apply the Benjamini-Hochberg prescription with \alpha=0.05 and possibly call some discoveries as true. By repeated simulation, estimate the probability of thus getting N wrongly-called discoveries, for N=0, 1, 2, and 3.

Here is my code for this problem:

import numpy as np import scipy.stats as sc import matplotlib.pyplot as plt

alpha = .05 n = 50 sim = 10000 l = []

for k in range(sim): #creates the random p-values p = np.arange(0,1,1.0/n) for i in range(n): p[i] = (np.random.uniform(0.0, 1.0, 1)) p = sorted(p)

#gets positive discoveries x = np.arange(1,n+1, 1) i = 0 while p[i] < (i+1)*alpha/(n+1): i += 1 #data output if i == 0: fd[0] += 1 if i == 1: fd[1] += 1 if i == 2: fd[2] += 1 if i == 3: fd[3] += 1

print fd

**for 10^5 runs, the data is as follows:**

N = 0 --> 95258 --> P(N = 0) = 95.258% N = 1 --> 4412 --> P(N = 1) = 4.412% N = 2 --> 303 --> P(N = 2) = .303% N = 3 --> 32 --> P(N =3) = .032%

2. Does the distribution that you found in problem 1 depend on M? On \alpha? Derive its form analytically for the usual case of \alpha \ll 1?