# /Segment33

## To Calculate

1. How many distinct m by n contingency tables are there that have exactly N total events?

[itex] {N+m*n \choose m*n-1} [/itex]

Hmm. Close, but not right. How did you get this? Note that with N=1 we should get m*n, but your answer doesn't! Wpress 15:37, 21 April 2013 (CDT)

Thanks for pointing that out, Bill. It should be this: [itex] {N+m*n-1 \choose m*n-1} [/itex] Kai also helped me with that. The idea is to split N balls among m*n boxes using m*n-1 dividers. Dividers placed in the spaces between balls can never leave a box empty, so we first add one extra ball to each box. That gives N+m*n balls in a row, with N+m*n-1 spaces between them, and every box holding at least one ball. We pick m*n-1 of those N+m*n-1 spaces for the dividers, thus N+m*n-1 choose m*n-1. Taking the extra ball back out of each box recovers a table with N events in which cells may be empty. -- Silu 10.49, 22 April 2013 (CDT)
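The corrected formula can be checked by brute force for small cases. The sketch below is only a sanity check, not part of the original answer; the function names `count_tables_brute_force` and `count_tables_formula` are made up for illustration, and `math.comb` assumes Python 3.8 or later.

```python
from itertools import product
from math import comb


def count_tables_brute_force(m, n, total):
    """Count m-by-n tables of nonnegative integers whose cells sum to `total`."""
    cells = m * n
    # Try every assignment of 0..total to each cell and keep those with the right sum.
    return sum(1 for t in product(range(total + 1), repeat=cells) if sum(t) == total)


def count_tables_formula(m, n, total):
    """Stars-and-bars count: choose m*n-1 divider positions among total+m*n-1 slots."""
    cells = m * n
    return comb(total + cells - 1, cells - 1)


# Compare the two counts on a few small cases (hypothetical test sizes).
for m, n, total in [(2, 2, 1), (2, 2, 14), (2, 3, 4)]:
    assert count_tables_brute_force(m, n, total) == count_tables_formula(m, n, total)
```

With N=1 the formula gives m*n, as Bill's check requires, and for 2 by 2 tables with N=14 it gives 680 tables when zero counts are allowed.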

2. For every distinct 2 by 2 contingency table containing exactly 14 elements, compute its chi-square statistic, and also its Wald statistic. Display your results as a scatter plot of one statistic versus the other.

Here's my code. I didn't want to deal with zero counts, since they can give zero denominators in the probabilities for Wald's T, so I excluded every table with a zero count in any cell.

```
# Written for Python 3 (true division).
import itertools
import math
import matplotlib.pyplot as plt

N = 14  # total number of events in each table

# Every cell is at least 1, so no cell can exceed N - 3 = 11.
# itertools.product (rather than permutations) keeps tables with
# repeated counts, e.g. (3, 3, 4, 4).
counts = range(1, N - 2)
tables = [t for t in itertools.product(counts, repeat=4) if sum(t) == N]

# tables now holds every 2 by 2 table (a, b, c, d), laid out as
# [[a, b], [c, d]], with no zero cells and a total count of 14
wald = []
chisquares = []
for a, b, c, d in tables:
    row_1, row_2 = a + b, c + d
    col_1, col_2 = a + c, b + d

    # Wald's T: difference of the two column proportions divided by
    # its standard error, using the pooled proportion p
    p_1 = a / col_1
    p_2 = b / col_2
    p = row_1 / N
    nt = p_1 - p_2
    dt_1 = math.sqrt(p * (1 - p))
    dt_2 = math.sqrt(1 / col_1 + 1 / col_2)
    wald.append(nt / (dt_1 * dt_2))

    # chi-square: sum over cells of (observed - expected)^2 / expected
    e_11 = row_1 * col_1 / N
    e_12 = row_1 * col_2 / N
    e_21 = row_2 * col_1 / N
    e_22 = row_2 * col_2 / N
    chisquare = ((a - e_11) ** 2 / e_11 + (b - e_12) ** 2 / e_12
                 + (c - e_21) ** 2 / e_21 + (d - e_22) ** 2 / e_22)
    chisquares.append(chisquare)

plt.scatter(chisquares, wald)
plt.xlabel('chisquared')
plt.ylabel('Wald T')
plt.show()
```
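The points in this scatter plot should fall on a sideways parabola: for a 2 by 2 table, the square of the statistic as computed above (with the pooled proportion p in the denominator) equals the chi-square statistic. Here is a small sketch that checks this numerically on a few tables. The helper names `wald_pooled` and `chi_square` and the example tables are my own, chosen for illustration.

```python
import math


def wald_pooled(a, b, c, d):
    """Statistic from the code above for the table [[a, b], [c, d]], using pooled p."""
    n = a + b + c + d
    col_1, col_2 = a + c, b + d
    p = (a + b) / n
    return (a / col_1 - b / col_2) / math.sqrt(p * (1 - p) * (1 / col_1 + 1 / col_2))


def chi_square(a, b, c, d):
    """Chi-square statistic: sum of (observed - expected)^2 / expected over the cells."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    return sum(
        (observed[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2)
        for j in range(2)
    )


# Each example table has 14 events and no zero cells.
for table in [(1, 2, 3, 8), (5, 2, 3, 4), (3, 3, 4, 4)]:
    assert math.isclose(wald_pooled(*table) ** 2, chi_square(*table), abs_tol=1e-12)
```

So the two statistics carry the same information about the strength of the association. The sign of T additionally records its direction.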

## To Think About

1. Suppose you want to find out if living under power lines causes cancer. Describe in detail how you would do this (1) as a case/control study, (2) as a longitudinal study, (3) as a snapshot study. Can you think of a way to do it as a study with all the marginals fixed (protocol 4)?

2. For an m by n contingency table, can you think of a systematic way to code "the loop over all possible contingency tables with the same marginals" in slide 8?
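One possible approach to question 2, offered as a sketch rather than as the method intended in the lecture: fill the table one row at a time, never letting a cell exceed what is left of its column total, and let the last row be forced by whatever remains in each column. The generator name `tables_with_marginals` and its helpers are hypothetical.

```python
def tables_with_marginals(row_sums, col_sums):
    """Yield every table of nonnegative integers with the given row and column sums."""
    if not row_sums or not col_sums or sum(row_sums) != sum(col_sums):
        return  # no table can have these marginals

    def split(total, caps):
        # Yield every way to write `total` as one entry per column,
        # with each entry no larger than that column's remaining capacity.
        if len(caps) == 1:
            if total <= caps[0]:
                yield (total,)
            return
        for first in range(min(total, caps[0]) + 1):
            for rest in split(total - first, caps[1:]):
                yield (first,) + rest

    def fill(rows_left, cols_left):
        if len(rows_left) == 1:
            # The last row is forced: it uses up what remains in each column.
            yield (tuple(cols_left),)
            return
        for row in split(rows_left[0], cols_left):
            remaining = tuple(c - x for c, x in zip(cols_left, row))
            for rest in fill(rows_left[1:], remaining):
                yield (row,) + rest

    yield from fill(tuple(row_sums), tuple(col_sums))


# Example: all 2 by 2 tables with row sums (3, 2) and column sums (4, 1).
for table in tables_with_marginals((3, 2), (4, 1)):
    print(table)
```

Because the row and column totals agree, the entries left over for the last row automatically sum to its row total, so no extra check is needed there. The number of tables grows quickly with the totals, so this is practical only for small tables.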