# Travis: Segment 28

1. Draw a sample of 100 points from the uniform distribution $U(0,1)$. This is your data set. Fit GMM models to your sample (now considered as being on the interval $-\infty < x < \infty$) with increasing numbers of components $K$, at least $K=1,\ldots,5$. Plot your models. Do they get better as $K$ increases? Did you try multiple starting values to find the best (hopefully globally best) solutions for each $K$?
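One way to sketch exercise 1 is a small hand-rolled EM fitter for 1-D Gaussian mixtures, run from several random starting points for each $K$ and keeping the best total log-likelihood. Everything here (the function name `em_gmm_1d`, the seeds, the number of restarts and iterations) is an illustrative choice, not prescribed by the problem:

```python
import numpy as np

def em_gmm_1d(x, K, rng, n_iter=300):
    """One EM run for a K-component 1-D Gaussian mixture.

    Returns (weights, means, variances) and the total log-likelihood,
    computed with log-sum-exp to avoid underflow.
    """
    n = len(x)
    mu = rng.choice(x, size=K, replace=False)   # random starting centers
    sig2 = np.full(K, np.var(x))
    w = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: per-point, per-component log joint densities.
        logp = (np.log(w) - 0.5 * np.log(2.0 * np.pi * sig2)
                - 0.5 * (x[:, None] - mu) ** 2 / sig2)
        m = logp.max(axis=1, keepdims=True)
        loglik_i = m[:, 0] + np.log(np.exp(logp - m).sum(axis=1))
        r = np.exp(logp - loglik_i[:, None])    # responsibilities
        # M-step: responsibility-weighted updates; the small floor on
        # the variances guards against degenerate collapse onto a point.
        Nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / Nk
        sig2 = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk + 1e-8
        w = Nk / n
    return (w, mu, sig2), loglik_i.sum()

rng = np.random.default_rng(17)                 # arbitrary seed
x = rng.uniform(0.0, 1.0, size=100)             # the data set
# Best-of-10 restarts per K addresses the "multiple starting values" part.
best = {K: max(em_gmm_1d(x, K, rng)[1] for _ in range(10))
        for K in range(1, 6)}
for K, ll in best.items():
    print(K, ll)
```

Note that the *training* log-likelihood can only improve as $K$ grows, so "do they get better?" really asks for held-out data or a complexity penalty, not just this number.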
2. Multiplying a lot of individual likelihoods will often underflow. (a) On average, how many values drawn from $U(0,1)$ can you multiply before the product underflows to zero? (b) What, analytically, is the distribution of the sum of $N$ independent values $\log(U)$, where $U\sim U(0,1)$? (c) Is your answer to (a) consistent with your answer to (b)?
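For exercise 2, part (b) has a clean analytic answer: if $U\sim U(0,1)$ then $-\log U$ is Exponential(1), so $-\sum_{i=1}^N \log U_i$ is Gamma$(N,1)$ with mean $N$. The product underflows to exactly zero once the log of the product drops below the log of the smallest positive double, roughly $\log(5\times 10^{-324})\approx -745$. A quick empirical check of part (a) (trial count and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def steps_to_underflow(rng):
    """Multiply U(0,1) draws until the running product hits exactly 0."""
    p, n = 1.0, 0
    while p > 0.0:
        p *= rng.uniform()
        n += 1
    return n

trials = np.array([steps_to_underflow(rng) for _ in range(200)])
print(trials.mean())  # empirical answer to (a)

# Consistency check for (c): the sum of logs has mean -N, so underflow
# near -745 predicts a mean count in the mid-700s, with spread ~ sqrt(N).
```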
3. Suppose you want to approximate some analytically known function $f(x)$ (whose integral is finite), as a sum of $K$ Gaussians with different centers and widths. You could pretend that $f(x)$ (or some scaling of it) was a probability distribution, draw $N$ points from it and do the GMM thing to find the approximating Gaussians. Now take the limit $N\rightarrow \infty$, figure out how sums become integrals, and write down an iterative method for fitting Gaussians to a given $f(x)$. Does it work? (You can assume that well-defined definite integrals can be done numerically.)
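In the $N\rightarrow\infty$ limit of exercise 3, the EM sample averages become integrals against $f(x)$: the responsibilities $r_k(x)$ are computed pointwise, and the updates become, e.g., $\mu_k = \int f(x)\,r_k(x)\,x\,dx \,/\, \int f(x)\,r_k(x)\,dx$. A minimal sketch, doing those integrals on a uniform grid; the target triangle function, grid, $K$, and seed are all illustrative assumptions:

```python
import numpy as np

def fit_gaussians_to_f(f, grid, K, n_iter=400, seed=3):
    """EM in the N -> infinity limit: data sums become f-weighted quadrature sums."""
    rng = np.random.default_rng(seed)
    dx = grid[1] - grid[0]                     # uniform grid spacing
    fx = f(grid)
    fx = fx / (fx.sum() * dx)                  # treat f as a density
    mu = rng.uniform(grid[0], grid[-1], K)     # random starting centers
    sig2 = np.full(K, np.var(grid))
    w = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities r_k(x) at every quadrature point.
        p = (w * np.exp(-0.5 * (grid[:, None] - mu) ** 2 / sig2)
             / np.sqrt(2.0 * np.pi * sig2))
        r = p / np.maximum(p.sum(axis=1, keepdims=True), 1e-300)
        # M-step: averages over data become integrals against f(x) dx.
        wt = (fx * dx)[:, None] * r            # f(x) r_k(x) dx
        Nk = wt.sum(axis=0)
        mu = (wt * grid[:, None]).sum(axis=0) / Nk
        sig2 = (wt * (grid[:, None] - mu) ** 2).sum(axis=0) / Nk + 1e-10
        w = Nk                                 # the Nk already sum to 1
    return w, mu, sig2

# Illustrative target: a triangle function on [-1, 1], fit with K = 3.
grid = np.linspace(-2.0, 2.0, 2001)
f = lambda x: np.clip(1.0 - np.abs(x), 0.0, None)
w, mu, sig2 = fit_gaussians_to_f(f, grid, K=3)
mix = (w * np.exp(-0.5 * (grid[:, None] - mu) ** 2 / sig2)
       / np.sqrt(2.0 * np.pi * sig2)).sum(axis=1)
```

Plotting `mix` against the normalized `f` on the grid shows how well the $K$ Gaussians track the target, and is a reasonable way to answer "does it work?".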