Segment 30. Expectation Maximization (EM) Methods

From Computational Statistics (CSE383M and CS395T)

Watch this segment


Links to the slides: PDF file or PowerPoint file


To Calculate

1. For a set of positive values <math>\{x_i\}</math>, use Jensen's inequality to show (a) the mean of their square is never less than the square of their mean, and (b) their (arithmetic) mean is never less than their harmonic mean.
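Both parts of problem 1 can be sanity-checked numerically before writing the Jensen's-inequality proof. A minimal sketch in Python (the data here are arbitrary random positive values, not anything from the course):

```python
import random

random.seed(0)
x = [random.uniform(0.1, 10.0) for _ in range(1000)]  # arbitrary positive values
n = len(x)

mean = sum(x) / n                              # arithmetic mean
mean_of_squares = sum(v * v for v in x) / n
harmonic_mean = n / sum(1.0 / v for v in x)

# (a) mean of squares >= square of the mean, since x -> x^2 is convex
assert mean_of_squares >= mean ** 2
# (b) arithmetic mean >= harmonic mean, since x -> 1/x is convex for x > 0
assert mean >= harmonic_mean
```

The assertions are the two claims of problem 1; in the proof, the "mixture" in Jensen's inequality is the uniform weighting 1/n over the <math>x_i</math>.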

2. Sharpen the argument about termination of E-M methods that was given in slide 4, as follows: Suppose that <math>g(x) \ge f(x)</math> for all <math>x</math>, for two functions <math>f</math> and <math>g</math>. Prove that, at any local maximum <math>x_m</math> of <math>f</math>, one of these two conditions must hold: (1) <math>g(x_m) > f(x_m)</math> [in which case the E-M algorithm has not yet terminated], or (2) <math>g(x_m)</math> is a local maximum of <math>g</math> [in which case the E-M algorithm terminates at a maximum of <math>g</math>, as advertised]. You can make any reasonable assumptions about the continuity of the functions.

To Think About

1. Jensen's inequality says something like "any concave function of a mixture of things is greater than the same mixture of the individual concave functions". What "mixture of things" is this idea being applied to in the proof of the E-M theorem (slide 4)?

2. So slide 4 proves that some function is less than the actual function of interest, namely <math>L(\theta)</math>. What makes this such a powerful idea?


The class activity for Friday can be found at EM activity.