EM activity

From Computational Statistics Course Wiki

Today's exercise is about this paper.

  1. Read the paper, up to the end of the Mathematical foundations section.
  2. Explain how every term in the formulation of the EM algorithm on the third line of slide 5 of segment 30 appears in the algorithm outlined in Figure 1(b) of the paper. (In other words, do for the algorithm in Figure 1(b) what slide 6 in the segment does for GMMs.) Whenever an equation can be made specific to the exact model in the paper, try to do so. Specifically:
    1. What are $\textbf{z}$, $\textbf{x}$ and $\theta$?
    2. What is $P(\textbf{x} | \textbf{z}, \theta)$?
    3. What is $P(\textbf{z} | \textbf{x}, \theta')$?
    4. What is $P(\textbf{z} | \theta)$?
    5. What is a full statement of the function to be maximized in the M-step?
    6. The paper never explicitly states the closed-form solution to this M-step maximization, but it can be reverse-engineered from the figure. What is this formula?
  3. Implement both the algorithm outlined in Figure 1(b) of the paper and the naive algorithm given in the paragraph beginning “One iterative scheme for obtaining completions could work as follows:” in the third column of the first page of the paper. Compare the performance of these two algorithms on the data set in the paper.
  4. Extra bonus challenge: show that your answer to 2.6 really does maximize your answer to 2.5. (This is actually pretty tricky!)
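
As a possible starting point for step 3, the two schemes can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's own code: it assumes a two-coin model in which each set of `n` tosses comes from coin A or coin B with equal prior probability, and it uses one plausible reading of the update in Figure 1(b) (expected heads divided by expected tosses). The function names (`likelihood`, `em_step`, `hard_step`, `iterate`), the head counts, and the starting values are all illustrative; replace them with the data set and initial parameters from the paper.

```python
import math

def likelihood(heads, n, p):
    """P(heads | coin bias p) for n tosses, up to the binomial coefficient,
    which is the same for both coins and cancels wherever it is used here."""
    p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
    return math.exp(heads * math.log(p) + (n - heads) * math.log(1 - p))

def em_step(heads_per_set, n, theta_a, theta_b):
    """One soft (EM) update: weight each set by the posterior over coins."""
    heads_a = tosses_a = heads_b = tosses_b = 0.0
    for h in heads_per_set:
        la, lb = likelihood(h, n, theta_a), likelihood(h, n, theta_b)
        w_a = la / (la + lb)  # E-step: P(coin A | set), uniform prior assumed
        heads_a += w_a * h
        tosses_a += w_a * n
        heads_b += (1 - w_a) * h
        tosses_b += (1 - w_a) * n
    # M-step (assumed form): expected heads over expected tosses per coin
    return heads_a / tosses_a, heads_b / tosses_b

def hard_step(heads_per_set, n, theta_a, theta_b):
    """One naive update: assign each set to its more likely coin, then
    re-estimate each bias from the sets assigned to that coin."""
    heads_a = tosses_a = heads_b = tosses_b = 0
    for h in heads_per_set:
        if likelihood(h, n, theta_a) >= likelihood(h, n, theta_b):
            heads_a += h
            tosses_a += n
        else:
            heads_b += h
            tosses_b += n
    # Keep the old estimate if a coin received no sets at all
    new_a = heads_a / tosses_a if tosses_a else theta_a
    new_b = heads_b / tosses_b if tosses_b else theta_b
    return new_a, new_b

def iterate(step, heads_per_set, n, theta_a, theta_b, tol=1e-8, max_iter=1000):
    """Apply `step` repeatedly until both parameters change by less than tol."""
    for _ in range(max_iter):
        new_a, new_b = step(heads_per_set, n, theta_a, theta_b)
        if abs(new_a - theta_a) < tol and abs(new_b - theta_b) < tol:
            return new_a, new_b
        theta_a, theta_b = new_a, new_b
    return theta_a, theta_b

# Illustrative data (an assumption): heads observed in five sets of 10 tosses.
heads_per_set = [5, 9, 8, 4, 7]
n = 10

print("EM:   ", iterate(em_step, heads_per_set, n, 0.6, 0.5))
print("naive:", iterate(hard_step, heads_per_set, n, 0.6, 0.5))
```

Both schemes share the same `iterate` driver, so the comparison isolates the one real difference: `em_step` splits each set fractionally between the coins, while `hard_step` commits each set entirely to one coin. Trying several starting values is a simple way to see how sensitive each scheme is to initialization.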