# Segment 28. Gaussian Mixture Models in 1-D

## Contents

#### Watch this segment

(Don't worry: the static frame you see below is not the beginning of the segment. Press the play button to start from the beginning.)

{{#widget:Iframe |url=http://www.youtube.com/v/n7u_tq0I6jM&hd=1 |width=800 |height=625 |border=0 }}

Links to the slides: PDF file or PowerPoint file

### Problems

#### To Calculate

1. Draw a sample of 100 points from the uniform distribution $U(0,1)$. This is your data set. Fit GMM models to your sample (now considered as being on the interval $-\infty < x < \infty$) with increasing numbers of components $K$, at least $K=1,\ldots,5$. Plot your models. Do they get better as $K$ increases? Did you try multiple starting values to find the best (hopefully globally best) solutions for each $K$?
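A minimal sketch of problem 1, assuming a hand-rolled 1-D EM fit (rather than any particular library) with random restarts to guard against local maxima; the floor on $\sigma$ is an assumption added to prevent components collapsing onto single data points:

```python
import numpy as np

rng = np.random.default_rng(0)


def fit_gmm_1d(x, K, n_iter=200, n_restarts=10, seed=0):
    """Fit a K-component 1-D Gaussian mixture by EM; keep the best restart."""
    r = np.random.default_rng(seed)
    best = None
    for _ in range(n_restarts):
        mu = r.choice(x, size=K, replace=False)   # centers drawn from the data
        sigma = np.full(K, x.std())
        pi = np.full(K, 1.0 / K)
        for _ in range(n_iter):
            # E step: log of pi_k * N(x_n; mu_k, sigma_k), shape (N, K)
            logp = (np.log(pi) - 0.5 * np.log(2 * np.pi * sigma**2)
                    - 0.5 * ((x[:, None] - mu) / sigma) ** 2)
            m = logp.max(axis=1, keepdims=True)
            loglik = (m.squeeze() + np.log(np.exp(logp - m).sum(axis=1))).sum()
            gamma = np.exp(logp - m)
            gamma /= gamma.sum(axis=1, keepdims=True)
            # M step: responsibility-weighted means, widths, and weights
            Nk = gamma.sum(axis=0)
            mu = (gamma * x[:, None]).sum(axis=0) / Nk
            sigma = np.sqrt((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
            sigma = np.maximum(sigma, 1e-6)       # guard against collapse
            pi = Nk / len(x)
        if best is None or loglik > best[0]:
            best = (loglik, pi, mu, sigma)
    return best


x = rng.uniform(0, 1, 100)
for K in range(1, 6):
    loglik, pi, mu, sigma = fit_gmm_1d(x, K)
    print(f"K={K}: best log-likelihood = {loglik:.2f}")
```

Plotting the fitted density against a histogram of the sample (not shown) makes the "do they get better?" question concrete: the in-sample likelihood never decreases with $K$, but the extra components mostly chase sampling noise at the flat top of the uniform.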

2. Multiplying a lot of individual likelihoods will often underflow. (a) On average, how many values drawn from $U(0,1)$ can you multiply before the product underflows to zero? (b) What, analytically, is the distribution of the sum of $N$ independent values $\log(U)$, where $U\sim U(0,1)$? (c) Is your answer to (a) consistent with your answer to (b)?
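A quick empirical check of part (a), with the analytic answer to (b) noted in comments. Since $-\log U \sim \mathrm{Exp}(1)$, the sum of $N$ values of $\log U$ is minus a $\mathrm{Gamma}(N,1)$ (Erlang) variate, with mean $-N$; the float64 product is exactly zero once that sum drops below the log of the smallest positive subnormal, about $-744.4$, so we expect roughly 745 factors on average:

```python
import math
import numpy as np

rng = np.random.default_rng(1)


def count_until_underflow(rng):
    """Multiply U(0,1) draws until the float64 product is exactly zero."""
    prod, n = 1.0, 0
    while prod > 0.0:
        prod *= rng.uniform()
        n += 1
    return n


counts = [count_until_underflow(rng) for _ in range(1000)]
print("mean factors until underflow:", np.mean(counts))

# (b) sum of N values log(U) ~ -Gamma(N, 1): mean -N, variance N.
# (c) zero is reached when the log-sum passes log(smallest subnormal):
print("log(smallest positive float64):", math.log(5e-324))  # about -744.4
```

The simulated mean lands near 745, consistent with (b): the product survives subnormal territory (below about $10^{-308}$) for a while before finally flushing to zero. The practical moral, of course, is to sum log-likelihoods instead of multiplying likelihoods.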

3. Suppose you want to approximate some analytically known function $f(x)$ (whose integral is finite), as a sum of $K$ Gaussians with different centers and widths. You could pretend that $f(x)$ (or some scaling of it) was a probability distribution, draw $N$ points from it and do the GMM thing to find the approximating Gaussians. Now take the limit $N\rightarrow \infty$, figure out how sums become integrals, and write down an iterative method for fitting Gaussians to a given $f(x)$. Does it work? (You can assume that well-defined definite integrals can be done numerically.)
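One possible sketch of the continuum limit in problem 3: as $N\rightarrow\infty$, every sum over data points in the EM updates becomes an integral weighted by (normalized) $f(x)$. The choices below — a simple Riemann sum on a uniform grid as the integrator, a symmetric triangle as the example $f(x)$, and $K=2$ — are all assumptions of this sketch:

```python
import numpy as np

x = np.linspace(-1.5, 1.5, 3001)
dx = x[1] - x[0]
f = np.maximum(1.0 - np.abs(x), 0.0)   # example target: triangle function
f = f / (f.sum() * dx)                 # pretend f is a normalized density


def mixture(x, pi, mu, sigma):
    """Component-wise Gaussian mixture values on the grid, shape (grid, K)."""
    return (pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
            / (np.sqrt(2 * np.pi) * sigma))


K = 2
mu = np.array([-0.5, 0.5])             # starting guesses
sigma = np.array([0.5, 0.5])
pi = np.full(K, 1.0 / K)

for _ in range(300):
    p = mixture(x, pi, mu, sigma)
    gamma = p / p.sum(axis=1, keepdims=True)   # E step: responsibilities
    w = f[:, None] * gamma                     # integrand: f(x) * gamma_k(x)
    Nk = w.sum(axis=0) * dx                    # M step: integrals replace sums
    mu = (w * x[:, None]).sum(axis=0) * dx / Nk
    sigma = np.sqrt((w * (x[:, None] - mu) ** 2).sum(axis=0) * dx / Nk)
    pi = Nk

fit = mixture(x, pi, mu, sigma).sum(axis=1)
print("L1 error of the 2-Gaussian fit:", np.abs(fit - f).sum() * dx)
```

It does work, in the sense that each iteration is an ordinary EM step with the empirical distribution replaced by $f(x)\,dx$; the residual $L_1$ error reflects how well $K$ Gaussians can imitate the triangle's sharp peak, not a failure of the iteration.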