# Segment 20 Sanmit Narvekar

## Segment 20

#### To Calculate

1. (See lecture slide 3.) For one-dimensional $x$, the model $y(x | \mathbf b)$ is called "linear" if $y(x | \mathbf b) = \sum_k b_k X_k(x)$, where $X_k(x)$ are arbitrary known functions of $x$. Show that minimizing $\chi^2$ produces a set of linear equations (called the "normal equations") for the parameters $b_k$.

First we write down $\chi^2$, the quantity we wish to minimize:

$$\chi^2 = \sum_i \left( \frac{y_i - y(x_i|\mathbf b)}{\sigma_i}\right)^2$$

Here $i$ indexes the data points $(x_i, y_i)$, each with measurement error $\sigma_i$. We minimize with respect to each parameter $b_k$ by setting the partial derivative to zero:

$$\frac{\partial \chi^2}{\partial b_k} = -2 \sum_i \left( \frac{y_i - y(x_i|\mathbf b)}{\sigma_i}\right) \frac{X_k(x_i)}{\sigma_i} = 0$$

We can drop the constant prefactor, since it does not affect where the derivative vanishes. Expanding the inner term and moving the model to the right-hand side gives:

$$\sum_i \frac{1}{\sigma_i^2}y_i X_k(x_i) = \sum_i \frac{1}{\sigma_i^2} y(x_i|\mathbf b) X_k(x_i)$$

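This gives one equation for each parameter $b_k$. Substituting $y(x_i|\mathbf b) = \sum_j b_j X_j(x_i)$ on the right-hand side shows that each equation is linear in the parameters:

$$\sum_j \left[ \sum_i \frac{X_j(x_i) X_k(x_i)}{\sigma_i^2} \right] b_j = \sum_i \frac{y_i X_k(x_i)}{\sigma_i^2}$$

These are the normal equations.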
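As a rough numerical illustration of the normal equations, here is a minimal sketch that builds them for an arbitrary list of basis functions and solves for the parameters. The function name `solve_normal_equations`, the quadratic basis, and the data are all made up for this example; they are not from the lecture.

```python
import numpy as np

def solve_normal_equations(x, y, sigma, basis):
    """Fit y(x|b) = sum_k b_k X_k(x) by solving the normal equations."""
    x, y, sigma = (np.asarray(a, dtype=float) for a in (x, y, sigma))
    # Error-scaled design matrix: A[i, k] = X_k(x_i) / sigma_i
    A = np.column_stack([X(x) for X in basis]) / sigma[:, None]
    # alpha[j, k] = sum_i X_j(x_i) X_k(x_i) / sigma_i^2
    alpha = A.T @ A
    # beta[k] = sum_i y_i X_k(x_i) / sigma_i^2
    beta = A.T @ (y / sigma)
    return np.linalg.solve(alpha, beta)

# Hypothetical example: quadratic basis and noiseless, made-up data
basis = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]
x = np.linspace(0.0, 4.0, 9)
y = 1.0 + 2.0 * x - 0.5 * x**2
sigma = np.full_like(x, 0.3)
b = solve_normal_equations(x, y, sigma, basis)
```

Since the example data lie exactly on the model, the recovered `b` should match the coefficients used to generate `y` up to rounding. For badly conditioned bases, a least-squares solver such as `np.linalg.lstsq` applied to `A` directly is usually a safer choice than forming `A.T @ A`.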

2. A simple example of a linear model is $y(x | \mathbf b) = b_0 + b_1 x$, which corresponds to fitting a straight line to data. What are the MLE estimates of $b_0$ and $b_1$ in terms of the data: $x_i$'s, $y_i$'s, and $\sigma_i$'s?

Using the derivation above with $X_0(x) = 1$ and $X_1(x) = x$, so that $y(x|\mathbf b) = b_0 + b_1 x$, the normal equation for $k = 0$ is:

$$\sum_i \frac{1}{\sigma_i^2}y_i = \sum_i \frac{1}{\sigma_i^2} (b_0 + b_1x_i)$$

Rearranging for $b_0$:

$$b_0 = \frac{\sum_i \frac{1}{\sigma_i^2} y_i - \sum_i \frac{1}{\sigma_i^2} b_1 x_i}{\sum_i \frac{1}{\sigma_i^2}}$$

We can do the same for $b_1$. First the normal equation for $k = 1$:

$$\sum_i \frac{1}{\sigma_i^2}y_ix_i = \sum_i \frac{1}{\sigma_i^2}(b_0 + b_1x_i) x_i$$

Rearranging for $b_1$:

$$b_1 = \frac{\sum_i \frac{x_iy_i}{\sigma_i^2} - \sum_i \frac{b_0x_i}{\sigma_i^2}}{\sum_i \frac{x_i^2}{\sigma_i^2}}$$

Note that $b_0$ appears in the expression for $b_1$ and vice versa. These are two linear equations in two unknowns, so substituting one expression into the other (or solving the $2 \times 2$ system directly) gives each estimate in terms of the data alone.
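Carrying out that substitution is easier with some shorthand, introduced here and not part of the original notes: $S = \sum_i 1/\sigma_i^2$, $S_x = \sum_i x_i/\sigma_i^2$, $S_y = \sum_i y_i/\sigma_i^2$, $S_{xx} = \sum_i x_i^2/\sigma_i^2$, $S_{xy} = \sum_i x_i y_i/\sigma_i^2$, and $\Delta = S S_{xx} - S_x^2$. The two normal equations then read $S_y = b_0 S + b_1 S_x$ and $S_{xy} = b_0 S_x + b_1 S_{xx}$, which solve to $b_0 = (S_{xx} S_y - S_x S_{xy})/\Delta$ and $b_1 = (S S_{xy} - S_x S_y)/\Delta$. A minimal sketch of this closed form, with a hypothetical function name and made-up data:

```python
import numpy as np

def fit_line(x, y, sigma):
    """Weighted straight-line fit y = b0 + b1*x via the closed-form solution."""
    x, y, sigma = (np.asarray(a, dtype=float) for a in (x, y, sigma))
    w = 1.0 / sigma**2                      # weights 1 / sigma_i^2
    S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
    Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
    delta = S * Sxx - Sx**2                 # determinant of the 2x2 system
    b0 = (Sxx * Sy - Sx * Sxy) / delta
    b1 = (S * Sxy - Sx * Sy) / delta
    return b0, b1

# Made-up data lying exactly on y = 2 + 3x, with unequal errors
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 3.0 * x
sigma = np.array([0.5, 1.0, 0.5, 2.0])
b0, b1 = fit_line(x, y, sigma)
```

The determinant `delta` vanishes only when all the $x_i$ are identical, in which case a line cannot be determined from the data.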

#### To Think About

1. We often rather casually assume a uniform prior $P(\mathbf b)= \text{constant}$ on the parameters $\mathbf b$. If the prior is not uniform, then is minimizing $\chi^2$ the right thing to do? If not, then what should you do instead? Can you think of a situation where the difference would be important?

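It seems like a non-uniform prior over the parameters is equivalent to regularization. Instead of minimizing $\chi^2$ alone, one would maximize the posterior, which amounts to minimizing $\chi^2 - 2 \ln P(\mathbf b)$. This could matter in a setting with many parameters relative to the amount of data, where the prior helps prevent overfitting.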
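To make the regularization connection concrete, here is a minimal sketch under one specific assumption that is not in the original notes: an independent Gaussian prior $b_k \sim N(0, \tau^2)$ on each parameter. Then $-2 \ln P(\mathbf b) = \sum_k b_k^2/\tau^2 + \text{const}$, and the normal equations simply gain a term $1/\tau^2$ on the diagonal (ridge regression). The function name and data are hypothetical.

```python
import numpy as np

def map_fit_gaussian_prior(x, y, sigma, basis, tau):
    """MAP estimate for a linear model, assuming b_k ~ N(0, tau^2)."""
    x, y, sigma = (np.asarray(a, dtype=float) for a in (x, y, sigma))
    # Error-scaled design matrix: A[i, k] = X_k(x_i) / sigma_i
    A = np.column_stack([X(x) for X in basis]) / sigma[:, None]
    # Minimizing chi^2 + sum_k b_k^2 / tau^2 adds 1/tau^2 to the diagonal
    alpha = A.T @ A + np.eye(A.shape[1]) / tau**2
    beta = A.T @ (y / sigma)
    return np.linalg.solve(alpha, beta)

# Made-up straight-line data
basis = [lambda x: np.ones_like(x), lambda x: x]
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
sigma = np.ones_like(x)

b_wide = map_fit_gaussian_prior(x, y, sigma, basis, tau=1e6)   # nearly uniform prior
b_tight = map_fit_gaussian_prior(x, y, sigma, basis, tau=0.1)  # strong prior
```

With a very wide prior the result is essentially the plain $\chi^2$ fit, while a tight prior shrinks the parameters toward zero. Other priors would change the penalty term and in general no longer give linear equations.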

2. What if, in lecture slide 2, the measurement errors were $e_i \sim \text{Cauchy}(0,\sigma_i)$ instead of $e_i \sim N(0,\sigma_i)$? How would you find MLE estimates for the parameters $\mathbf b$?
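The notes leave this question open; the following is only one possible approach, not the author's answer. With Cauchy errors the negative log-likelihood is $\sum_i \ln\left(1 + \left(\frac{y_i - y(x_i|\mathbf b)}{\sigma_i}\right)^2\right)$ plus a constant, and setting its derivatives to zero no longer gives linear equations, so the minimum has to be found numerically. The sketch below uses iteratively reweighted least squares: the stationarity condition looks like a weighted normal equation with weights $1/(\sigma_i^2 (1 + z_i^2))$, where $z_i$ is the current scaled residual. All names and data are hypothetical, and the iteration may stop at a local minimum.

```python
import numpy as np

def cauchy_nll(b, M, y, sigma):
    """Negative log-likelihood for Cauchy errors, up to a constant."""
    z = (y - M @ b) / sigma
    return np.sum(np.log1p(z**2))

def cauchy_mle_irls(M, y, sigma, n_iter=200):
    """Iteratively reweighted least squares for a linear model, Cauchy errors."""
    w = 1.0 / sigma**2                      # start from the Gaussian weights
    for _ in range(n_iter):
        MW = M * w[:, None]
        # Weighted normal equations with the current weights
        b = np.linalg.solve(M.T @ MW, MW.T @ y)
        z = (y - M @ b) / sigma             # scaled residuals
        w = 1.0 / (sigma**2 * (1.0 + z**2)) # large residuals get small weight
    return b

# Made-up straight-line data with one gross outlier
x = np.arange(10.0)
y = 1.0 + 2.0 * x
y[7] += 50.0
sigma = np.ones_like(x)
M = np.column_stack([np.ones_like(x), x])   # M[i, k] = X_k(x_i)

b_gauss = np.polyfit(x, y, 1)[::-1]         # ordinary chi^2 fit, (b0, b1)
b_cauchy = cauchy_mle_irls(M, y, sigma)
```

In this example the outlier pulls the $\chi^2$ fit well away from the underlying line, while the Cauchy fit stays close to it, because the heavy tails make a single large residual much less costly. A general-purpose optimizer started from several initial guesses would be another reasonable choice.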