Travis: Segment 20

1. (See lecture slide 3.) For one-dimensional $x$, the model $y(x | \mathbf b)$ is called "linear" if $y(x | \mathbf b) = \sum_k b_k X_k(x)$, where $X_k(x)$ are arbitrary known functions of $x$. Show that minimizing $\chi^2$ produces a set of linear equations (called the "normal equations") for the parameters $b_k$.
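(Hint: the calculation can be sketched as follows, using the notation of the problem; filling in the steps is the exercise.)

```latex
% chi^2 for the general linear model
\chi^2(\mathbf b) = \sum_i \frac{\bigl(y_i - \sum_k b_k X_k(x_i)\bigr)^2}{\sigma_i^2}

% set each partial derivative to zero:
\frac{\partial \chi^2}{\partial b_j}
  = -2 \sum_i \frac{X_j(x_i)}{\sigma_i^2}
    \Bigl(y_i - \sum_k b_k X_k(x_i)\Bigr) = 0

% which rearranges to the normal equations, linear in the b_k:
\sum_k \Bigl[\sum_i \frac{X_j(x_i)\, X_k(x_i)}{\sigma_i^2}\Bigr] b_k
  = \sum_i \frac{y_i\, X_j(x_i)}{\sigma_i^2}
```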
2. A simple example of a linear model is $y(x | \mathbf b) = b_0 + b_1 x$, which corresponds to fitting a straight line to data. What are the MLE estimates of $b_0$ and $b_1$ in terms of the data: $x_i$'s, $y_i$'s, and $\sigma_i$'s?
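As a check on your algebra, the closed-form weighted least-squares answer for the straight-line case can be computed directly. This is a minimal sketch (the function name and variable names are my own), assuming independent Gaussian errors with known $\sigma_i$:

```python
import numpy as np

def fit_line_weighted(x, y, sigma):
    """MLE (weighted least squares) for y = b0 + b1*x with known sigma_i.

    Solves the 2x2 normal equations in the standard textbook closed form.
    """
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    w = 1.0 / np.asarray(sigma, float) ** 2   # weights 1/sigma_i^2
    S = w.sum()
    Sx = (w * x).sum()
    Sy = (w * y).sum()
    Sxx = (w * x * x).sum()
    Sxy = (w * x * y).sum()
    delta = S * Sxx - Sx ** 2                 # determinant of the normal equations
    b1 = (S * Sxy - Sx * Sy) / delta          # slope
    b0 = (Sy - b1 * Sx) / S                   # intercept
    return b0, b1
```

For equal $\sigma_i$ this reduces to the familiar unweighted least-squares slope and intercept.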
3. We often rather casually assume a uniform prior $P(\mathbf b) = \text{constant}$ on the parameters $\mathbf b$. If the prior is not uniform, is minimizing $\chi^2$ still the right thing to do? If not, what should you do instead? Can you think of a situation where the difference would be important?
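One concrete case to have in mind: with a non-uniform prior, maximizing the posterior means minimizing $\chi^2 - 2\ln P(\mathbf b)$ rather than $\chi^2$ alone. The sketch below (not from the lecture; the Gaussian prior and the name `map_fit` are my own choices for illustration) shows the special case of a Gaussian prior $b_k \sim N(0, \tau^2)$, which turns the normal equations into a ridge-style penalized solve:

```python
import numpy as np

def map_fit(X, y, sigma, tau):
    """MAP estimate for a linear model with Gaussian prior b_k ~ N(0, tau^2).

    Minimizes chi^2 + sum_k b_k^2 / tau^2, i.e. -2 log posterior up to a
    constant.  X is the n-by-m design matrix with X[i, k] = X_k(x_i).
    """
    Xw = X / sigma[:, None]                          # weight each row by 1/sigma_i
    yw = y / sigma
    A = Xw.T @ Xw + np.eye(X.shape[1]) / tau ** 2    # penalized normal equations
    c = Xw.T @ yw
    return np.linalg.solve(A, c)
```

As $\tau \to \infty$ the prior becomes uniform and this reduces to ordinary $\chi^2$ minimization; for small $\tau$ the estimates are shrunk toward zero, which is exactly the situation (few or noisy data) where the choice of prior matters.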
4. What if, in lecture slide 2, the measurement errors were $e_i \sim \text{Cauchy}(0,\sigma_i)$ instead of $e_i \sim N(0,\sigma_i)$? How would you find the MLE estimates of the parameters $\mathbf b$?
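A numerical sketch of one possible approach for the straight-line case (my own illustration, assuming scipy is available): the Cauchy negative log-likelihood $\sum_i \ln\bigl(1 + r_i^2\bigr)$ with $r_i = (y_i - b_0 - b_1 x_i)/\sigma_i$ is not quadratic in $\mathbf b$, so there are no normal equations and one minimizes it numerically instead.

```python
import numpy as np
from scipy.optimize import minimize

def fit_line_cauchy(x, y, sigma):
    """MLE for y = b0 + b1*x under Cauchy(0, sigma_i) measurement errors.

    Minimizes the Cauchy negative log-likelihood numerically (Nelder-Mead),
    starting from the ordinary least-squares fit as a common heuristic.
    """
    x, y, sigma = (np.asarray(a, float) for a in (x, y, sigma))

    def nll(b):
        r = (y - b[0] - b[1] * x) / sigma
        return np.sum(np.log1p(r ** 2))      # sum of log(1 + r_i^2)

    b_start = np.polyfit(x, y, 1)[::-1]      # least-squares start: [b0, b1]
    return minimize(nll, b_start, method="Nelder-Mead").x
```

Because the Cauchy log-likelihood grows only logarithmically in the residual, the fit is far less sensitive to outliers than least squares, which is one reason to consider heavy-tailed error models in the first place.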