CS395T Computational Statistics: Study Guide for Oral Exams (2011)
The oral exam will randomly select from the following lines, one at a
time, and the question will always be the same: "Tell me about...". A
good response can be just a few sentences. You can use the whiteboard
if you want to write an equation or sketch a graph (quickly!). You can
say "next question" if you don't want to answer. This is better than
trying to fake it if you really don't know. The exam grade is based
both on the quality of your responses and on the number of questions
you get through in 20 minutes.
If we don't get through Unit 20 in class, then you are not
responsible for it.
It is not as bad as it sounds. Good luck!
Unit 1: Probability and Inference
(Lectures 1, 2)
- what is computational statistics?
- probability
- calculus of inference
- probability axioms
- Law of Or-ing, Law of And-ing, Law of Exhaustion
- Law of De-Anding (Law of Total Probability)
- Bayes Theorem
- EME (exhaustive, mutually exclusive) hypotheses
- contrast Bayesians and Frequentists
- probabilities modified by data
- prior probability
- posterior probability
- evidence factor
- Bayes denominator
- background information
- commutativity and associativity of evidence
- the Monty Hall problem
- Hempel's paradox
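
If Monty Hall still bothers you, a quick simulation settles it; here is a minimal Python sketch (mine, not course code; the trial count is arbitrary):

    import random

    def monty_trial(switch):
        doors = [0, 1, 2]
        car = random.choice(doors)
        pick = random.choice(doors)
        # Monty opens a door that is neither your pick nor the car.
        opened = random.choice([d for d in doors if d != pick and d != car])
        if switch:
            pick = next(d for d in doors if d != pick and d != opened)
        return pick == car

    n = 100_000
    print(sum(monty_trial(False) for _ in range(n)) / n)  # stay:   ~1/3
    print(sum(monty_trial(True) for _ in range(n)) / n)   # switch: ~2/3
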
Unit 2: Bayesian Estimation of Parameters
(Lectures 2, 3, 4)
- marginalization
- uninteresting parameters in a model
- probability density function
- Dirac delta function
- massed prior
- uniform prior
- uninformative prior
- i.i.d.
- Bernoulli trials
- sufficient statistic
- conjugate prior
- beta distribution
- variable length short tandem repeat (VLSTR)
- binomial distribution
- conditional independence
- naive Bayes models
- improper prior
- log-uniform prior
- paradigm for Bayesian parameter estimation
- statistical model
- data trimming
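
Worth having at your fingertips: the conjugate beta-binomial update in a few lines of Python (a sketch with made-up counts, not course code):

    from scipy.stats import beta

    # With a Beta(a, b) prior on the Bernoulli success probability p, observing
    # k successes in n i.i.d. trials gives the posterior Beta(a + k, b + n - k).
    a, b = 1.0, 1.0              # uniform prior
    k, n = 7, 10                 # hypothetical data
    post = beta(a + k, b + n - k)
    print(post.mean())           # posterior mean (a + k) / (a + b + n) = 2/3
    print(post.interval(0.95))   # central 95% credible interval
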
Unit 3: Common Distributions
(Lecture 4)
- measures of central tendency
- mean minimizes mean square deviation
- median minimizes mean absolute deviation
- centered moments
- skewness and kurtosis
- standard deviation
- additivity of mean and variance
- semi-invariants
- semi-invariants of Gaussian and Poisson
- normal (Gaussian) distribution
- Student distribution
- Cauchy distribution
- heavy-tailed distributions
- William Sealy Gosset
- exponential distribution
- lognormal distribution
- gamma distribution
- chi-square distribution
- probability density function (PDF)
- cumulative distribution function (CDF)
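
The two minimization facts above are easy to verify numerically; a sketch (my example, with arbitrary sample and grid sizes):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.lognormal(size=500)                    # a skewed sample
    c = np.linspace(x.min(), x.max(), 2001)        # candidate centers
    msd = ((x[:, None] - c) ** 2).mean(axis=0)     # mean square deviation
    mad = np.abs(x[:, None] - c).mean(axis=0)      # mean absolute deviation
    print(c[msd.argmin()], x.mean())               # minimizer ~ sample mean
    print(c[mad.argmin()], np.median(x))           # minimizer ~ sample median
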
Unit 4: CLT, Gaussians, MLE
(Lecture 5)
- central limit theorem (CLT)
- characteristic function of a distribution
- Fourier convolution theorem
- characteristic function of a Gaussian
- characteristic function of Cauchy distribution
- maximum a posteriori (MAP)
- maximum likelihood (MLE)
- sample mean and variance
- estimate parameters of a Gaussian
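
A minimal CLT demonstration in Python (my sketch; 12 uniforms is a classical choice because their variances sum to exactly 1):

    import numpy as np

    rng = np.random.default_rng(2)
    # Sum of 12 U(0,1) deviates, minus 6: mean 0, variance 12 * (1/12) = 1,
    # and by the CLT already close to N(0, 1).
    z = rng.uniform(size=(100_000, 12)).sum(axis=1) - 6.0
    print(z.mean(), z.var())           # ~0, ~1
    print(np.mean(np.abs(z) < 1.0))    # ~0.683, the Gaussian 1-sigma mass

The Cauchy distribution is the standard counterexample: it has no finite variance, so the CLT does not apply to it.
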
Unit 5: Random Deviates
(Lecture 6)
- random deviate
- U(0,1)
- transformation method (random deviates)
- rejection method (random deviates)
- ratio of uniforms method (random deviates)
- squeeze (random deviates)
- Leva's algorithm for normal deviates
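
A minimal sketch of the transformation method (mine; exponential deviates, where the inverse CDF is available in closed form):

    import numpy as np

    rng = np.random.default_rng(3)
    u = rng.uniform(size=100_000)     # U(0,1) deviates

    # Transformation method: x = F^{-1}(u) has CDF F. For Exponential(lam),
    # F(x) = 1 - exp(-lam * x), so F^{-1}(u) = -ln(1 - u) / lam.
    lam = 2.0
    x = -np.log(1.0 - u) / lam
    print(x.mean(), x.var())          # ~1/lam, ~1/lam^2
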
Unit 6: p-value (tail) tests
(Lectures 6, 7, 8)
- p-value test
- null hypothesis
- advantage of tail tests over Bayesian methods
- distribution of p-values under the null hypothesis
- t-values
- Saccharomyces cerevisiae
- A, C, G, T
- multinomial distribution
- p-test critical region
- one-sided vs. two-sided p-value tests
- stopping rule paradox
- likelihood ratio test
- Bayes odds ratio
- Normal approximation to binomial distribution
- Ronald Aylmer Fisher
- posterior predictive p-value
- empirical Bayes
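
The fact that p-values are U(0,1) under the null hypothesis is worth verifying once; a sketch (my simulation, arbitrary sizes):

    import numpy as np
    from scipy.stats import ttest_1samp

    rng = np.random.default_rng(4)
    # With the null hypothesis true, P(p <= alpha) = alpha for any alpha.
    p = np.array([ttest_1samp(rng.normal(size=20), 0.0).pvalue
                  for _ in range(5000)])
    print(np.mean(p < 0.05))    # ~0.05
    print(np.mean(p < 0.50))    # ~0.50
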
Unit 7: Multiple Hypotheses
(Lecture 8)
- multiple hypothesis correction
- Bonferroni correction
- false discovery rate (FDR)
- Bayesian approach to multiple hypotheses
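
A sketch of the Benjamini-Hochberg step-up procedure (my code; the function name and level q are arbitrary):

    import numpy as np

    def benjamini_hochberg(pvals, q=0.05):
        # Reject all p-values at or below p_(k), where k is the largest i
        # with p_(i) <= q * i / m; this controls the FDR at level q.
        p = np.sort(np.asarray(pvals))
        m = len(p)
        ok = np.nonzero(p <= q * np.arange(1, m + 1) / m)[0]
        return p[ok[-1]] if ok.size else 0.0    # rejection threshold

    # Bonferroni would instead use the much stricter per-test level q / m.
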
Unit 8: Multivariate Normal Distributions and Chi-Square
(Lectures 9, 10)
- multivariate normal distribution
- covariance matrix
- estimate mean, covariance from multivariate data
- fitting data by a multivariate normal distribution
- slice or projection of a multivariate normal r.v.
- Cholesky decomposition
- how to generate multivariate normal deviates
- how to compute and draw error ellipses
- linear correlation matrix
- test for correlation
- chi-square statistic
- chi-square distribution
- generalization of chi-square to non-independent data
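
"How to generate multivariate normal deviates" fits in a few lines (a sketch; the mean and covariance are made up):

    import numpy as np

    rng = np.random.default_rng(5)
    mu = np.array([1.0, -2.0])
    sigma = np.array([[2.0, 0.6],
                      [0.6, 1.0]])     # covariance matrix (must be SPD)

    # If sigma = L L^T (Cholesky, L lower triangular) and z ~ N(0, I),
    # then mu + L z ~ N(mu, sigma).
    L = np.linalg.cholesky(sigma)
    z = rng.normal(size=(2, 10_000))
    x = mu[:, None] + L @ z
    print(np.cov(x))                   # ~ sigma
    print(x.mean(axis=1))              # ~ mu
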
Unit 9: Weighted Nonlinear Least Squares Fitting
(Lectures 10, 11, 12)
- Normal error model
- correlated Normal error model
- maximum likelihood estimation of parameters
- relation of chi-square to posterior probability
- nonlinear least squares fitting
- chi-square fitting
- accuracy of fitted parameters
- basin of convergence
- Hessian matrix and relation to covariance matrix
- posterior distribution of fitted parameters
- calculation of Hessian matrix
- how to marginalize over uninteresting parameters
- how to condition on known parameter values
- covariance matrix of fitted parameters vs. of data
- consistency (property of MLE)
- asymptotic efficiency (property of MLE)
- Fisher Information Matrix
- asymptotic normality (property of MLE)
- linearized propagation of errors
- sampling the posterior distribution (in least squares fitting)
- bootstrap resampling
- population distribution vs. sample distribution
- drawing with and without replacement
- bootstrap theorem
- honoring (or not) the stated measurement errors
- ratio of two normals as an example of a heavy-tailed (Cauchy-like) statistic
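
A minimal bootstrap sketch in Python (my example; the sample, statistic, and resample count are arbitrary):

    import numpy as np

    rng = np.random.default_rng(6)
    data = rng.normal(5.0, 2.0, size=100)      # hypothetical sample

    # Bootstrap: resample the data with replacement, recompute the statistic,
    # and read off the spread of the resampled values.
    meds = np.array([np.median(rng.choice(data, size=data.size, replace=True))
                     for _ in range(5000)])
    print(np.median(data), meds.std())         # estimate and its std. error
    print(np.percentile(meds, [2.5, 97.5]))    # ~95% confidence interval
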
Unit 10: Confidence Intervals, Goodness of Fit
(Lectures 13, 14)
- what chi-square value indicates a good fit?
- how to get confidence intervals from chi-square values
- precision improves as square root of data quantity
- degrees of freedom in chi-square fit
- goodness-of-fit p-value (in least squares fitting)
- number of degrees of freedom
- linear constraints (chi-square)
- nonlinear constraints (chi-square)
- pseudo-count
- mean-square error (relation to chi-square)
- what makes a statistic accurately chi-square
- normal approximation to chi-square distribution
- Poisson as approximation to Binomial
- Pearson vs. modified Neyman chi-square
- corrected chi-square statistic for Poisson data
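
The goodness-of-fit p-value is one call once you know the degrees of freedom; a sketch with made-up numbers:

    from scipy.stats import chi2

    chisq, ndata, nparams = 112.3, 100, 3      # hypothetical fit results
    nu = ndata - nparams                       # degrees of freedom
    print(chi2.sf(chisq, nu))                  # P(chi-square >= observed)
    # Rule of thumb: a good fit has chisq ~ nu +/- sqrt(2 * nu).
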
Unit 11: Mixture Models and Gaussian Mixture Models
(Lectures 15, 16)
- forward statistical model
- mixture model
- assignment vector (mixture model)
- marginalization in mixture models
- hierarchical Bayesian models
- Gaussian mixture model
- expectation-maximization (EM) methods
- probabilistic assignment to components (GMMs)
- Expectation or E-step
- Maximization or M-step
- overall likelihood of a GMM
- log-sum-exp formula
- starting values for GMM iteration
- number of components in a GMM (pros and cons)
- K-means clustering
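
The log-sum-exp formula in Python (a sketch of the standard trick, not course code):

    import numpy as np

    def log_sum_exp(a):
        # log(sum(exp(a))) computed stably by factoring out the max, e.g. for
        # accumulating a GMM's log likelihood from component log densities.
        m = np.max(a)
        return m + np.log(np.sum(np.exp(a - m)))

    print(log_sum_exp(np.array([-1000.0, -1001.0])))   # ~ -999.69, finite
    # The naive np.log(np.sum(np.exp(a))) underflows here to log(0) = -inf.
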
Unit 12: Theory of EM Methods
(Lectures 16, 17)
- Jensen's inequality
- concave function (EM methods)
- EM theorem (e.g., geometrical interpretation)
- missing data (EM methods)
- GMM as an EM: what is the missing data, what are the parameters?
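
Jensen's inequality is the one item here you can sanity-check numerically; a short sketch (my example):

    import numpy as np

    rng = np.random.default_rng(7)
    x = rng.lognormal(size=100_000)
    # For the concave function log: E[log X] <= log E[X],
    # with equality only for degenerate X.
    print(np.mean(np.log(x)))    # ~0.0 for this lognormal
    print(np.log(np.mean(x)))    # ~0.5, strictly larger
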
Unit 13: Maximum Likelihood Estimation
(Lecture 17)
- use of Student distributions vs. normal distribution
- heavy-tailed models in MLE
- model selection
- Akaike information criterion (AIC)
- Bayes information criterion (BIC)
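
AIC and BIC are short enough to memorize as code; a sketch with hypothetical log likelihoods:

    import numpy as np

    def aic(loglike, k):
        return 2 * k - 2 * loglike             # smaller is better

    def bic(loglike, k, n):
        return k * np.log(n) - 2 * loglike     # harsher on k for large n

    # Hypothetical comparison: 2- vs. 3-parameter model on n = 500 points.
    print(aic(-1234.0, 2), aic(-1230.5, 3))
    print(bic(-1234.0, 2, 500), bic(-1230.5, 3, 500))
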
Unit 14: Contingency Tables
(Lectures 18, 19, 20, 21)
- contingency table
- cross-tabulation
- row or column marginals
- chi-square or Pearson statistic for contingency table
- conditions vs. factors
- retrospective analysis or case/control study
- hypergeometric distribution
- prospective experiment or longitudinal study
- nuisance parameter
- cross-sectional or snapshot study
- multinomial distribution
- Fisher's Exact Test
- sufficient statistic (re contingency tables)
- Wald statistic (re contingency tables)
- fragility of 2-tailed Fisher Exact Test
- Permutation Test (re contingency tables)
- Monte Carlo calculation
- ordinal vs. nominal data
- advantages of ordinal data (re contingency tables)
- false pos vs. false neg in contingency table permutation test
- Dirichlet distribution as conjugate to multinomial
- how to generate Dirichlet deviates
- p or q as nuisance parameters in experimental protocols (contingency tables)
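
"How to generate Dirichlet deviates" has a standard recipe via gamma deviates; a sketch with made-up parameters:

    import numpy as np

    rng = np.random.default_rng(8)
    alpha = np.array([2.0, 3.0, 5.0])          # hypothetical parameters

    # Draw independent Gamma(alpha_i) deviates and normalize; the result
    # is Dirichlet(alpha) on the probability simplex.
    g = rng.gamma(alpha, size=(10_000, 3))
    p = g / g.sum(axis=1, keepdims=True)
    print(p.mean(axis=0))      # ~ alpha / alpha.sum() = [0.2, 0.3, 0.5]
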
Unit 15: Information Theory
(Lectures 21, 22)
- probable vs. improbable sequences (re entropy)
- Shannon's definition of entropy
- bits vs. nats
- maximally compressed message (re entropy)
- monographic vs. digraphic entropy
- conditional entropy
- mutual information
- side information
- Kelly's formula for proportional betting
- Kullback-Leibler distance
- KL-distance as competitive edge in betting
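
Entropy and KL distance in a few lines (my sketch; the distributions are made up, and kl assumes p is strictly positive):

    import numpy as np

    def entropy(p):
        # Shannon entropy H = -sum p_i ln p_i, in nats (divide by ln 2 for bits).
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    def kl(p, q):
        # Kullback-Leibler distance D(p||q) = sum p_i ln(p_i / q_i) >= 0.
        return np.sum(p * np.log(p / q))

    p = np.array([0.5, 0.25, 0.25])
    print(entropy(p) / np.log(2))              # 1.5 bits
    print(kl(p, np.array([1/3, 1/3, 1/3])))    # > 0 since p != q
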
Unit 16: Markov Chain Monte Carlo
(Lectures 23, 24)
- Bayes denominator (re MCMC)
- sampling the posterior distribution (re MCMC)
- Markov chain
- detailed balance
- ergodic sequence
- Metropolis-Hastings algorithm
- proposal distribution (re MCMC)
- Gibbs sampler
- waiting time in a Poisson process
- good vs. bad proposal generators in MCMC
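
A minimal Metropolis sketch (symmetric Gaussian proposal; the target is a hypothetical standard normal log posterior, so the answer is checkable):

    import numpy as np

    rng = np.random.default_rng(9)

    def log_post(x):
        return -0.5 * x * x       # unnormalized: the Bayes denominator cancels

    x, lp, chain = 0.0, log_post(0.0), []
    for _ in range(50_000):
        prop = x + rng.normal(0.0, 1.0)            # symmetric proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept w.p. min(1, ratio)
            x, lp = prop, lp_prop
        chain.append(x)                            # repeat x if rejected
    print(np.mean(chain), np.var(chain))           # ~0, ~1
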
Unit 17: Wiener Filtering
(Lectures 25, 26)
- bases in function space (re Wiener filtering)
- signal vs. noise model (re Wiener filtering)
- Wiener or optimal filter
- spatial or pixel basis
- wavelet basis
- DAUB wavelets
- quadrature mirror filter
- pyramidal algorithm
- wavelet plaid (or, continuity of basis wavelets)
- IRE test chart lady Jane (just kidding)
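
A minimal 1-D Wiener filter sketch (mine, in the Fourier basis rather than wavelets, and cheating by using the true signal and noise power spectra, which in practice you must estimate from the data):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1024
    t = np.arange(n)
    signal = np.sin(2 * np.pi * t / 64.0)     # smooth low-frequency signal
    y = signal + rng.normal(0.0, 0.5, n)      # signal plus white noise

    # Optimal (Wiener) filter: Phi(f) = S(f) / (S(f) + N(f)), applied in the
    # Fourier domain; S and N are the signal and noise power spectra.
    S = np.abs(np.fft.fft(signal)) ** 2
    N = np.full(n, 0.5 ** 2 * n)              # flat spectrum of white noise
    estimate = np.real(np.fft.ifft(S / (S + N) * np.fft.fft(y)))
    print(np.std(y - signal), np.std(estimate - signal))   # noise is reduced
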
Unit 18: Laplace Interpolation
(Lecture 26)
- Laplace's equation
- mean value theorem for Laplace's equation
- internal boundary condition
- bi-conjugate gradient method
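
A sketch of Laplace interpolation by brute-force relaxation (mine; a Jacobi iteration with periodic edges for brevity, where the bi-conjugate gradient method converges much faster):

    import numpy as np

    def laplace_interpolate(img, known, iters=5000):
        # Mean-value property of Laplace's equation: each unknown pixel becomes
        # the average of its four neighbors; known pixels are re-imposed each
        # sweep as internal boundary conditions.
        z = np.where(known, img, img[known].mean())
        for _ in range(iters):
            avg = 0.25 * (np.roll(z, 1, 0) + np.roll(z, -1, 0) +
                          np.roll(z, 1, 1) + np.roll(z, -1, 1))
            z = np.where(known, img, avg)
        return z

    # Demo: a smooth field with 80% of the pixels deleted.
    y, x = np.mgrid[0:32, 0:32]
    truth = np.sin(x / 5.0) + np.cos(y / 7.0)
    known = np.random.default_rng(12).uniform(size=truth.shape) < 0.2
    recon = laplace_interpolate(np.where(known, truth, 0.0), known)
    print(np.abs(recon - truth)[~known].mean())    # small reconstruction error
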
Unit 19: SVD, PCA, and All That
(Lectures 27, 28)
- data matrix or design matrix
- singular value decomposition (SVD)
- orthogonal matrix
- optimal decomposition into rank 1 matrices
- singular values
- principal component analysis (PCA)
- diagonalizing the covariance matrix
- how much total variance is explained by principal components?
- dimensional reduction
- main effects (re PCA)
- eigengenes and eigenarrays
- non-negative matrix factorization
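
PCA via SVD of the centered data matrix, in a few lines (a sketch on fabricated data):

    import numpy as np

    rng = np.random.default_rng(10)
    X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))   # fabricated data
    Xc = X - X.mean(axis=0)                  # center each column

    # SVD: Xc = U diag(s) V^T. The rows of Vt are the principal components;
    # s_i^2 / (n - 1) are the variances along them.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2 / (len(X) - 1)
    print(np.cumsum(var) / var.sum())        # fraction of variance explained
    scores = Xc @ Vt.T                       # data in the principal-axis basis
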
Unit 20 (probably will not get to): Binary Classifiers
- binary classifier
- Type I vs. II error
- TP, FP, FN, TN
- confusion matrix
- one classifier dominates another
- true pos rate, sensitivity, recall
- positive predictive value, precision
- false discovery rate (re classifiers)
- false positive rate
- specificity (re classifiers)
- negative predictive value (re classifiers)
- ROC curve
- convex hull of a ROC curve
- precision-recall curve
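
Finally, a sketch of building a ROC curve from raw scores (my code; the synthetic classifier is arbitrary):

    import numpy as np

    def roc_curve(scores, labels):
        # Sweep the threshold from high to low, accumulating true-positive
        # and false-positive rates.
        order = np.argsort(-scores)
        l = labels[order]
        tpr = np.concatenate(([0.0], np.cumsum(l) / l.sum()))
        fpr = np.concatenate(([0.0], np.cumsum(1 - l) / (len(l) - l.sum())))
        return fpr, tpr

    rng = np.random.default_rng(11)
    labels = rng.integers(0, 2, 500)
    scores = labels + rng.normal(0.0, 1.0, 500)   # hypothetical noisy classifier
    fpr, tpr = roc_curve(scores, labels)
    auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)  # trapezoid rule
    print(auc)     # well above 0.5; a random classifier gives ~0.5
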