# Nick Wilson

My favorite segments and class activities are marked with one to three stars (*, **, or ***) at the end of their titles. The more stars, the better.

Many of the plots from my experiments are in the right margin of this page. Click on the images for a larger version or go to the segment for full details and code.

## Contents

- 1 Segments
- 1.1 Segment 1 - Let's Talk about Probability
- 1.2 Segment 2 - Bayes
- 1.3 Segment 3 - Monty Hall
- 1.4 Segment 4 - The Jailer's Tip
- 1.5 Segment 5 - Bernoulli Trials
- 1.6 Segment 6 - The Towne Family Tree
- 1.7 Segment 7 - Central Tendency and Moments
- 1.8 Segment 8 - Some Standard Distributions (*)
- 1.9 Segment 9 - Characteristic Functions
- 1.10 Segment 10 - The Central Limit Theorem (*)
- 1.11 Segment 11 - Random Deviates
- 1.12 Segment 12 - P-Value Tests
- 1.13 Segment 13 - The Yeast Genome
- 1.14 Segment 14 - Bayesian Criticism of P-Values
- 1.15 Segment 16 - Multiple Hypotheses
- 1.16 Segment 15 - The Towne Family - Again
- 1.17 Segment 17 - The Multivariate Normal Distribution
- 1.18 Segment 18 - The Correlation Matrix
- 1.19 Segment 19 - The Chi Square Statistic
- 1.20 Segment 20 - Nonlinear Least Squares Fitting
- 1.21 Segment 21 - Marginalize or Condition Uninteresting Fitted Parameters
- 1.22 Segment 22 - Uncertainty of Derived Parameters
- 1.23 Segment 23 - Bootstrap Estimation of Uncertainty
- 1.24 Segment 24 - Goodness of Fit
- 1.25 Segment 27 - Mixture Models
- 1.26 Segment 28 - Gaussian Mixture Models in 1-D (**)
- 1.27 Segment 29 - GMMs in N-Dimensions
- 1.28 Segment 30 - Expectation Maximization (EM) Methods
- 1.29 Segment 31 - A Tale of Model Selection
- 1.30 Segment 32 - Contingency Tables: A First Look
- 1.31 Segment 33 - Contingency Table Protocols and Exact Fisher Test (*)
- 1.32 Segment 34 - Permutation Tests
- 1.33 Segment 37 - A Few Bits of Information Theory
- 1.34 Segment 38 - Mutual Information
- 1.35 Segment 39 - MCMC and Gibbs Sampling (**)
- 1.36 Segment 40 - Markov Chain Monte Carlo, Example 1 (***)
- 1.37 Segment 41 - Markov Chain Monte Carlo, Example 2
- 1.38 Segment 47 - Low-Rank Approximation of Data
- 1.39 Segment 48 - Principal Component Analysis (PCA)

- 2 In-Class Activities
- 2.1 Segment 2 - Bayes
- 2.2 Segment 7 - Central Tendency and Moments
- 2.3 Segment 8 - Some Standard Distributions
- 2.4 Segment 10 - The Central Limit Theorem
- 2.5 Segment 11 - Random Deviates
- 2.6 Segment 13 - The Yeast Genome
- 2.7 Segment 16 - Multiple Hypotheses
- 2.8 Segment 15 - The Towne Family - Again (*)
- 2.9 Segment 20 - Nonlinear Least Squares Fitting
- 2.10 Segment 21 - Marginalize or Condition Uninteresting Fitted Parameters (*)
- 2.11 Segment 23 - Bootstrap Estimation of Uncertainty
- 2.12 Segment 27 - Mixture Models
- 2.13 Segment 28 - Gaussian Mixture Models in 1-D
- 2.14 Segment 29 - GMMs in N-Dimensions (***)
- 2.15 Segment 30 - Expectation Maximization (EM) Methods
- 2.16 Segment 32 - Contingency Tables: A First Look
- 2.17 Segment 33 - Contingency Table Protocols and Exact Fisher Test
- 2.18 Segment 34 - Permutation Tests (***)
- 2.19 Segment 41 - Markov Chain Monte Carlo, Example 2 (**)

## Segments

### Segment 1 - Let's Talk about Probability

### Segment 2 - Bayes

### Segment 3 - Monty Hall

### Segment 4 - The Jailer's Tip

### Segment 5 - Bernoulli Trials

### Segment 6 - The Towne Family Tree

### Segment 7 - Central Tendency and Moments

### Segment 8 - Some Standard Distributions (*)

### Segment 9 - Characteristic Functions

### Segment 10 - The Central Limit Theorem (*)

### Segment 11 - Random Deviates

### Segment 12 - P-Value Tests

### Segment 13 - The Yeast Genome

### Segment 14 - Bayesian Criticism of P-Values

### Segment 16 - Multiple Hypotheses

### Segment 15 - The Towne Family - Again

### Segment 17 - The Multivariate Normal Distribution

### Segment 18 - The Correlation Matrix

### Segment 19 - The Chi Square Statistic

### Segment 20 - Nonlinear Least Squares Fitting

### Segment 21 - Marginalize or Condition Uninteresting Fitted Parameters

### Segment 22 - Uncertainty of Derived Parameters

### Segment 23 - Bootstrap Estimation of Uncertainty

### Segment 24 - Goodness of Fit

### Segment 27 - Mixture Models

### Segment 28 - Gaussian Mixture Models in 1-D (**)

### Segment 29 - GMMs in N-Dimensions

### Segment 30 - Expectation Maximization (EM) Methods

### Segment 31 - A Tale of Model Selection

### Segment 32 - Contingency Tables: A First Look

### Segment 33 - Contingency Table Protocols and Exact Fisher Test (*)

### Segment 34 - Permutation Tests

### Segment 37 - A Few Bits of Information Theory

No problems given.

### Segment 38 - Mutual Information

No problems given.

### Segment 39 - MCMC and Gibbs Sampling (**)

### Segment 40 - Markov Chain Monte Carlo, Example 1 (***)

Turns out I like playing with MCMC. I ran a variety of experiments and made a bunch of colorful plots.

### Segment 41 - Markov Chain Monte Carlo, Example 2

### Segment 47 - Low-Rank Approximation of Data

No problems given.

### Segment 48 - Principal Component Analysis (PCA)

## In-Class Activities

I wrote many of these activities from scratch after class to make sure I had a good understanding of the material.

### Segment 2 - Bayes

01-22-14 -- Group 1 -- Class Activities

Knight/Troll/Gnome simulation. Gift box calculations.

### Segment 7 - Central Tendency and Moments

02-03-14 -- Group 4 -- Class Activities

Plotting PDFs of distributions with given mean/variance/skewness/kurtosis.

### Segment 8 - Some Standard Distributions

02-05-14 -- Group 4 -- Class Activities

Given 1000 values, estimate parameters assuming the data is from a few different distributions.

### Segment 10 - The Central Limit Theorem

02-10-14 -- Nick Wilson -- Class Activities

Visualizing how the sum of 12 U(0, 1) variables, minus 6, is approximately distributed as a normal with mean 0 and variance 1.
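
As a rough check of that claim (a minimal sketch, not the code from the activity): each U(0, 1) variable has mean 1/2 and variance 1/12, so the sum of 12 of them has mean 6 and variance 1, and subtracting 6 centers it at 0. The function name `sum_of_uniforms`, the seed, and the sample count are assumptions made for illustration.

```python
import random
import statistics


def sum_of_uniforms(rng, n_terms=12):
    """Return (sum of n_terms U(0, 1) draws) minus n_terms / 2."""
    return sum(rng.random() for _ in range(n_terms)) - n_terms / 2.0


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sum_of_uniforms(rng) for _ in range(100_000)]

sample_mean = statistics.fmean(samples)
sample_var = statistics.pvariance(samples)
print(f"mean = {sample_mean:.4f}, variance = {sample_var:.4f}")
```

The printed mean should land close to 0 and the variance close to 1. Unlike a true normal, the values can never fall outside [-6, 6], so the approximation is weakest in the far tails.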

### Segment 11 - Random Deviates

02-12-14 -- Class Activities

Building our own U(0,1) random number generator.
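
One simple way to build such a generator is a linear congruential generator. The sketch below is an assumption about how this could be done, not the generator written in class. It uses the well-known Park-Miller "minimal standard" constants (multiplier 16807, modulus 2^31 - 1), and the class name `ParkMillerLCG` is made up for this example.

```python
class ParkMillerLCG:
    """Linear congruential generator: x_next = (A * x) mod M."""

    A = 16807        # Park-Miller "minimal standard" multiplier
    M = 2**31 - 1    # prime modulus

    def __init__(self, seed=1):
        # The state must stay in 1..M-1; a state of 0 would be stuck at 0.
        self.state = seed % self.M or 1

    def random(self):
        """Advance the state and return a float strictly between 0 and 1."""
        self.state = (self.A * self.state) % self.M
        return self.state / self.M


gen = ParkMillerLCG(seed=1)
draws = [gen.random() for _ in range(10_000)]
print(min(draws), max(draws), sum(draws) / len(draws))
```

A generator like this is fine for seeing how uniform deviates are produced, but it is not suitable for serious simulation or anything security related.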

### Segment 13 - The Yeast Genome

02-17-14 -- Group 1 -- Class Activities

Finding regions of chromosome 4 that code for proteins.

### Segment 16 - Multiple Hypotheses

02-21-14 -- Nick Wilson -- Class Activities

P-value practice.

Multiple hypothesis testing on ORFs.

### Segment 15 - The Towne Family - Again (*)

02-24-14 -- Group 1 -- Group Quiz

Group quiz. We won!

### Segment 20 - Nonlinear Least Squares Fitting

03-17-14 -- Nick Wilson -- Class Activities

Least squares fitting: one dataset, several models.

### Segment 21 - Marginalize or Condition Uninteresting Fitted Parameters (*)

03-19-14 -- Nick Wilson -- Class Activities

Find the volcano!

### Segment 23 - Bootstrap Estimation of Uncertainty

03-24-14 -- Class Activity

Estimate the uncertainty in a statistic.
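
A minimal sketch of the bootstrap idea, assuming the statistic is a function of a one-dimensional sample: resample the data with replacement many times, recompute the statistic on each resample, and use the spread of those values as the uncertainty estimate. The data values, the function name `bootstrap_std_error`, and the resample count are made up for illustration and do not come from the activity.

```python
import random
import statistics


def bootstrap_std_error(data, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample draws n points from the data with replacement.
    estimates = [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]
    return statistics.stdev(estimates)


# Hypothetical measurements, used only to exercise the function.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]

se_median = bootstrap_std_error(data, statistics.median)
se_mean = bootstrap_std_error(data, statistics.fmean)
print(f"SE(median) ~ {se_median:.3f}, SE(mean) ~ {se_mean:.3f}")
```

For the mean, the bootstrap estimate can be sanity-checked against the usual formula (standard deviation divided by the square root of n). The appeal of the method is that it works the same way for statistics like the median that have no such simple formula.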

### Segment 27 - Mixture Models

03-28-14 -- Class Activity

Estimate parameters of a mixture model.

### Segment 28 - Gaussian Mixture Models in 1-D

03-31-14 -- Nick Wilson -- Class Activities

Netflix-ish data exploration.

### Segment 29 - GMMs in N-Dimensions (***)

04-02-14 -- Nick Wilson -- Class Activities

Beauty contest: visualize GMM convergence. Won the "Best use of Python" award.

### Segment 30 - Expectation Maximization (EM) Methods

04-04-14 -- Nick Wilson -- Class Activities

EM coin flip activity from the Nature article.

### Segment 32 - Contingency Tables: A First Look

04-09-14 -- Nick Wilson -- Class Activities

Analyzing contingency tables.

### Segment 33 - Contingency Table Protocols and Exact Fisher Test

04-11-14 -- Nick Wilson -- Class Activities

Analyzing chess data with contingency tables.

### Segment 34 - Permutation Tests (***)

04-14-14 -- Nick Wilson -- Class Activities

Speeding up the permutation test. I implemented two approaches: the one from the lecture (expanding the whole table) and one that permutes the table by sampling from the hypergeometric distribution.

Example of generating a permutation by sampling from the hypergeometric distribution (see activity for more details):

    Original table:
    [[ 4  1 14]
     [10  6 14]
     [51 35 60]]
    Original state:
    [[  0   0   0  19]
     [  0   0   0  30]
     [  0   0   0 146]
     [ 65  42  88   0]]
    Filling in (0,0) hyper(65, 130, 19)
    [[  4   0   0  15]
     [  0   0   0  30]
     [  0   0   0 146]
     [ 61  42  88   0]]
    Filling in (0,1) hyper(42, 88, 15)
    [[  4   8   0   7]
     [  0   0   0  30]
     [  0   0   0 146]
     [ 61  34  88   0]]
    Filling in (1,0) hyper(61, 122, 30)
    [[  4   8   0   7]
     [  6   0   0  24]
     [  0   0   0 146]
     [ 55  34  88   0]]
    Filling in (1,1) hyper(34, 88, 24)
    [[  4   8   0   7]
     [  6   9   0  15]
     [  0   0   0 146]
     [ 55  25  88   0]]
    Filling in the first 2 values in the last row.
    [[ 4  8  0  7]
     [ 6  9  0 15]
     [55 25  0 66]
     [ 0  0 88  0]]
    Filling in the first 2 values in the last column.
    [[ 4  8  7  0]
     [ 6  9 15  0]
     [55 25  0 66]
     [ 0  0 66  0]]
    Filling in the final cell at the bottom right corner
    [[ 4  8  7  0]
     [ 6  9 15  0]
     [55 25 66  0]
     [ 0  0  0  0]]
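
The trace above can be reproduced with a short sketch along these lines. This is a reconstruction from the trace, not the code from the activity. It assumes the `hyper(a, b, c)` arguments follow NumPy's `(ngood, nbad, nsample)` order: `a` is what remains of the current column total, `b` is what remains of the column totals to its right, and `c` is what remains of the current row total. The function name `sample_permuted_table` is made up for this example.

```python
import numpy as np


def sample_permuted_table(table, rng):
    """Draw a random table with the same row and column sums as `table`."""
    table = np.asarray(table)
    n_rows, n_cols = table.shape
    row_left = table.sum(axis=1)  # counts still to place in each row
    col_left = table.sum(axis=0)  # counts still to place in each column
    out = np.zeros_like(table)

    for i in range(n_rows - 1):
        for j in range(n_cols - 1):
            n_draw = int(row_left[i])
            if n_draw > 0:
                # How many of this row's remaining counts fall in column j
                # rather than in any column to its right.
                k = rng.hypergeometric(int(col_left[j]),
                                       int(col_left[j + 1:].sum()),
                                       n_draw)
            else:
                k = 0
            out[i, j] = k
            row_left[i] -= k
            col_left[j] -= k
        # The last cell of the row is forced by the row total.
        out[i, -1] = row_left[i]
        col_left[-1] -= row_left[i]
        row_left[i] = 0

    # The last row is forced by what is left of each column total.
    out[-1, :] = col_left
    return out


original = np.array([[4, 1, 14], [10, 6, 14], [51, 35, 60]])
rng = np.random.default_rng(0)
permuted = sample_permuted_table(original, rng)
print(permuted)
print(permuted.sum(axis=1), permuted.sum(axis=0))
```

Each call costs one hypergeometric draw per interior cell, so the work depends on the shape of the table rather than on the total count (195 here). That is the likely source of the speedup over expanding the table into individual observations and shuffling them.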

### Segment 41 - Markov Chain Monte Carlo, Example 2 (**)

04-25-14 -- Group -- Class Activities

Urns with weighted balls with MCMC.