Segment 32 Sanmit Narvekar

From Computational Statistics Course Wiki

Segment 32

To Calculate

1. 20 out of 100 U.S. Senators are women, yet when the Senate formed an intramural baseball team of 9 people only 1 woman was chosen for the team. What is the probability of this occurring by chance? What is the p-value with which the null hypothesis "there is no discrimination against women Senators" can be rejected?

The probability this would occur by chance is (from a hypergeometric distribution):

$$\Pr(\text{8 men, 1 woman}) = \frac{\binom{20}{1} \binom{80}{8}}{\binom{100}{9}} = 0.3048$$

The p-value for rejecting the null hypothesis is the probability of an outcome at least as extreme as the one observed ("something this extreme or more extreme..."), that is, a team of 8 men and 1 woman OR a team of all men. Even before calculating the second term, you can see that we will fail to reject the null hypothesis at any of the standard significance levels (e.g. 5% or below), since the first term alone, 0.3048, already exceeds the threshold.

$$\Pr(\text{9 men}) = \frac{\binom{80}{9}}{\binom{100}{9}} = 0.1219$$

Thus, the p-value of our dataset under the null hypothesis is 0.3048 + 0.1219 = 0.4267. Since this is higher than any of the usual significance levels, we cannot reject the null hypothesis.
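
As a quick numerical check of the two hypergeometric terms and their sum, here is a minimal Python sketch using only the standard library. The helper name `hypergeom_pmf` and its argument names are made up for this illustration; they are not from the lecture.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes) when `draws` items are taken without replacement.

    Hypothetical helper for this page: `population` is the total number of
    items, `successes` is how many of them count as a success.
    """
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

# 100 senators, 20 of them women, a team of 9
p_one_woman = hypergeom_pmf(1, 100, 20, 9)   # 8 men, 1 woman
p_no_women = hypergeom_pmf(0, 100, 20, 9)    # 9 men
p_value = p_one_woman + p_no_women           # "this extreme or more extreme"

print(round(p_one_woman, 4), round(p_no_women, 4), round(p_value, 4))
# 0.3048 0.1219 0.4267
```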


2. A large jelly bean jar has 20% red jelly beans, 30% blue, and 50% yellow. If 6 jelly beans are chosen at random, what is the chance of getting exactly 2 of each color? What is the name of this distribution?

This is the multinomial distribution.

$$\Pr(\text{2 of each color}) = \frac{N!}{n_r!\,n_b!\,n_y!}\, p_r^{n_r} p_b^{n_b} p_y^{n_y} = \frac{6!}{2!\,2!\,2!}\, (0.2)^2 (0.3)^2 (0.5)^2 = 0.081$$
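
The same multinomial probability can be checked with a few lines of Python. This is only a sketch; `multinomial_pmf` is a hypothetical helper written for this page, taking the counts and probabilities in the order red, blue, yellow.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Multinomial probability of seeing `counts` given category `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)          # N! / (n_r! n_b! n_y!)
    return coef * prod(p ** c for p, c in zip(probs, counts))

# red, blue, yellow: 2 of each out of 6 draws from the large jar
p_two_each = multinomial_pmf([2, 2, 2], [0.2, 0.3, 0.5])
print(round(p_two_each, 3))  # 0.081
```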


3. A small jelly bean jar has 2 red jelly beans, 3 blue, and 5 yellow. If 6 jelly beans are chosen at random, what is the chance of getting exactly 2 of each color? What is the name of this distribution?

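This is the (multivariate) hypergeometric distribution, since the beans are drawn without replacement from a small, finite jar with three colors.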

$$\Pr(\text{2 of each color}) = \frac{\binom{2}{2} \binom{3}{2} \binom{5}{2}}{\binom{10}{6}} = \frac{30}{210} = \frac{1}{7}$$
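
A small Python sketch confirms the exact fraction. The function name `multivariate_hypergeom_pmf` is an illustrative choice for this page, and `Fraction` is used so the result stays exact rather than a rounded decimal.

```python
from fractions import Fraction
from math import comb, prod

def multivariate_hypergeom_pmf(draws, jar):
    """Probability of drawing `draws[i]` beans of color i, without
    replacement, from a jar holding `jar[i]` beans of color i."""
    ways = prod(comb(n, k) for n, k in zip(jar, draws))
    return Fraction(ways, comb(sum(jar), sum(draws)))

# red, blue, yellow: jar of 2, 3, 5 beans; draw 2 of each
p_small_jar = multivariate_hypergeom_pmf([2, 2, 2], [2, 3, 5])
print(p_small_jar)  # 1/7
```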


To Think About

1. Suppose that, in the population, 82% of people are right-handed, 18% left-handed; 49% are male, 51% female; and that handedness and sex are independent. Repeatedly draw samples of N=15 individuals, form the contingency table, and apply the chi-square test for significance to get a p-value, exactly as described in the lecture segment. How often is your p-value less than 0.05? If you get an answer that is different from 0.05, why? Try larger values of N until the answer converges to 0.05. (How are you handling zero draws when they occur?)
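
One possible way to set up this experiment is sketched below in standard-library Python. Several choices here are assumptions rather than the lecture's prescription: the test is taken to be Pearson's chi-square on the 2x2 table with 1 degree of freedom and no continuity correction, the p-value uses the identity that the upper tail of a 1-degree-of-freedom chi-square at x equals erfc(sqrt(x/2)), and tables with an empty row or column (the "zero draws") are simply skipped. The names `chi2_pvalue_2x2` and `rejection_rate` are made up for this sketch.

```python
import math
import random

P_RIGHT, P_MALE = 0.82, 0.49
# Cell probabilities under independence, in the order
# (right, male), (right, female), (left, male), (left, female).
CELL_PROBS = [P_RIGHT * P_MALE, P_RIGHT * (1 - P_MALE),
              (1 - P_RIGHT) * P_MALE, (1 - P_RIGHT) * (1 - P_MALE)]

def chi2_pvalue_2x2(table):
    """Pearson chi-square p-value (1 dof, no continuity correction).

    Returns None when a row or column total is zero, because the expected
    counts are then zero and the statistic is undefined.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if 0 in rows or 0 in cols:
        return None
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.erfc(math.sqrt(chi2 / 2))

def rejection_rate(n, trials=2000, alpha=0.05, seed=0):
    """Fraction of usable simulated tables with p < alpha, plus the
    number of usable tables (assumption: degenerate tables are skipped)."""
    rng = random.Random(seed)
    rejections = usable = 0
    for _ in range(trials):
        cells = rng.choices(range(4), weights=CELL_PROBS, k=n)
        counts = [cells.count(i) for i in range(4)]
        p = chi2_pvalue_2x2([counts[:2], counts[2:]])
        if p is None:
            continue
        usable += 1
        rejections += p < alpha
    rate = rejections / usable if usable else float("nan")
    return rate, usable

for n in (15, 50, 200, 500):
    rate, usable = rejection_rate(n)
    print(n, round(rate, 3), usable)
```

Skipping degenerate tables is only one answer to the zero-draw question; counting them as "not significant" (p = 1) is another, and the two conventions give different rejection rates at small N. At N=15 the smallest expected cell count is about 15 × 0.18 × 0.49 ≈ 1.3, which is well below the range where the chi-square approximation is usually trusted.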