Eleisha's Segment 32: Contingency Tables: A First Look

From Computational Statistics Course Wiki

Latest revision as of 12:13, 13 April 2014

To Calculate:

1. 20 out of 100 U.S. Senators are women, yet when the Senate formed an intramural baseball team of 9 people only 1 woman was chosen for the team. What is the probability of this occurring by chance? What is the p-value with which the null hypothesis "there is no discrimination against women Senators" can be rejected?

<math>\text{P(Prob this occurs)} = \frac{\text{Ways to choose one female and eight males}}{\text{Total ways to choose}} = \frac{ {20 \choose 1} {80 \choose 8}}{{100 \choose 9}} \approx 0.30477. </math>

The null hypothesis is that there is no discrimination against women, so the team is effectively a random draw of 9 of the 100 Senators. The p-value is the probability, under the null, of an outcome at least as extreme as the one observed: a team with one woman or no women. Therefore the p-value can be calculated as:

<math>\text{p-value} = \frac{ {20 \choose 1} {80 \choose 8}}{{100 \choose 9}} + \frac{ {20 \choose 0} {80 \choose 9}}{{100 \choose 9}} \approx 0.42668. </math>

The following Python script performs these calculations:

<pre>
from scipy.special import comb  # scipy.misc.comb has been removed from SciPy

# Number of ways to choose any 9 of the 100 Senators
total_teams = comb(100, 9)

# Exactly one woman: 1 of the 20 women and 8 of the 80 men
prob_one_woman = comb(20, 1) * comb(80, 8) / total_teams
print("Probability this occurs by chance: {:.12g}".format(prob_one_woman))

# One-sided p-value: add the probability of a team with no women
p_value = prob_one_woman + comb(20, 0) * comb(80, 9) / total_teams
print("P-value with which the null hypothesis can be rejected: {:.12g}".format(p_value))
</pre>

Output:

<pre>
Probability this occurs by chance: 0.304773971521
P-value with which the null hypothesis can be rejected: 0.42668356013
</pre>
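
The two numbers above are the k = 1 term and the lower tail (k ≤ 1) of a hypergeometric distribution with 100 Senators, 20 women, and 9 draws. As a cross-check, here is a minimal sketch that uses only the Python standard library and exact fractions. The helper names `hypergeom_pmf` and `hypergeom_lower_tail` are made up for this illustration and are not part of the original script.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes) when drawing without replacement."""
    return Fraction(
        comb(successes, k) * comb(population - successes, draws - k),
        comb(population, draws),
    )


def hypergeom_lower_tail(k_max, population, successes, draws):
    """P(at most k_max successes): sum of the pmf for k = 0..k_max."""
    return sum(
        hypergeom_pmf(k, population, successes, draws) for k in range(k_max + 1)
    )


# Senate example: 100 Senators, 20 women, team of 9, 1 woman observed
prob_one_woman = hypergeom_pmf(1, population=100, successes=20, draws=9)
p_value = hypergeom_lower_tail(1, population=100, successes=20, draws=9)
print(float(prob_one_woman))
print(float(p_value))
```

Writing the p-value as a tail sum makes it easy to redo the calculation for a different observed count, for example a team with two women.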


2. A large jelly bean jar has 20% red jelly beans, 30% blue, and 50% yellow. If 6 jelly beans are chosen at random, what is the chance of getting exactly 2 of each color? What is the name of this distribution?

This is a multinomial distribution: because the jar is large, each draw is effectively independent with fixed color probabilities.

<math>N = 6, p_r = 0.2, p_b = 0.3, \text{and } p_y = 0.5 </math>

We want to calculate P, where:

<math>P = \text{Prob}(\text{2 red, 2 blue, 2 yellow}| N, p_r, p_b, p_y) = \frac{6!}{2!2!2!}(0.2)^2(0.3)^2(0.5)^2 = 0.081 </math>
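
The multinomial probability can be checked numerically. This is a minimal sketch using only the standard library. The function name `multinomial_pmf` is an assumption made for this example, not something from the lecture.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of seeing `counts` in sum(counts) independent draws."""
    # Multinomial coefficient N! / (n_1! n_2! ... n_k!), kept as an exact integer
    coefficient = factorial(sum(counts))
    for c in counts:
        coefficient //= factorial(c)
    # Multiply by p_1^n_1 * p_2^n_2 * ... * p_k^n_k
    return coefficient * prod(p ** c for p, c in zip(probs, counts))


# 2 red, 2 blue, 2 yellow with p_r = 0.2, p_b = 0.3, p_y = 0.5
p = multinomial_pmf([2, 2, 2], [0.2, 0.3, 0.5])
print(round(p, 6))
```

The coefficient evaluates to 6!/(2!2!2!) = 90, and 90 × 0.0009 reproduces the 0.081 above.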

3. A small jelly bean jar has 2 red jelly beans, 3 blue, and 5 yellow. If 6 jelly beans are chosen at random, what is the chance of getting exactly 2 of each color? What is the name of this distribution?

This is a (multivariate) hypergeometric distribution, since the beans are drawn without replacement from a small jar.

We want to calculate:

<math>\text{Prob(2 red, 2 blue, 2 yellow)} = \frac{\text{Number of ways to get outcome}}{\text{Total ways of choosing}} = \frac{ {2 \choose 2} {3 \choose 2} {5 \choose 2}}{{10 \choose 6}} \approx 0.1429 </math>
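
The same counting argument can be written as a short function. This is a sketch using the standard library's `math.comb`. The name `multivariate_hypergeom_pmf` is hypothetical and chosen only for this example.

```python
from math import comb


def multivariate_hypergeom_pmf(draws, jar):
    """P(drawing draws[i] beans of color i) without replacement from `jar`."""
    # Ways to pick the requested number of beans of each color
    ways = 1
    for k, n in zip(draws, jar):
        ways *= comb(n, k)
    # Divide by the ways to pick sum(draws) beans from the whole jar
    return ways / comb(sum(jar), sum(draws))


# Jar with 2 red, 3 blue, 5 yellow; draw 2 of each color
p = multivariate_hypergeom_pmf([2, 2, 2], [2, 3, 5])
print(round(p, 4))
```

Here the numerator is 1 × 3 × 10 = 30 and the denominator is 210, which gives the 0.1429 above.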

To Think About:

1. Suppose that, in the population, 82% of people are right-handed, 18% left-handed; 49% are male, 51% female; and that handedness and sex are independent. Repeatedly draw samples of N=15 individuals, form the contingency table, and apply the chi-square test for significance to get a p-value, exactly as described in the lecture segment. How often is your p-value less than 0.05? If you get an answer that is different from 0.05, why? Try larger values of N until the answer converges to 0.05. (How are you handling zero draws when they occur?)
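
One possible way to set up this simulation is sketched below. The lecture segment is not reproduced on this page, so the sketch assumes the standard Pearson chi-square statistic with 1 degree of freedom and no continuity correction. Tables with an empty row or column are skipped, which is only one of several ways to handle zero draws. The function names, trial count, and seed are all choices made for this illustration.

```python
import math
import random


def chi_square_p_value(table):
    """Pearson chi-square p-value for a 2x2 table, or None if a margin is zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        return None  # an expected count would be zero, so the statistic is undefined
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


def rejection_rate(n, trials=2000, p_right=0.82, p_male=0.49, alpha=0.05, seed=0):
    """Fraction of usable simulated tables whose p-value falls below alpha."""
    rng = random.Random(seed)
    rejections = 0
    used = 0
    for _ in range(trials):
        table = [[0, 0], [0, 0]]
        for _ in range(n):
            hand = 0 if rng.random() < p_right else 1  # 0 = right, 1 = left
            sex = 0 if rng.random() < p_male else 1    # 0 = male, 1 = female
            table[hand][sex] += 1
        p = chi_square_p_value(table)
        if p is None:
            continue  # assumption: tables with a zero margin are discarded
        used += 1
        if p < alpha:
            rejections += 1
    return rejections / used if used else float("nan")


for n in (15, 100, 400):
    print(n, rejection_rate(n))
```

Counting the skipped tables as "not rejected" instead of discarding them would change the reported rate for small N, so the choice should be stated alongside the result.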

Back To: Eleisha Jackson