# (Rene) Segment 20: Non-linear least squares

### Problems

#### To Calculate

1. (See lecture slide 3.) For one-dimensional $x$, the model $y(x|\mathbf{b})$ is called "linear" if $y(x|\mathbf{b}) = \sum_k b_k X_k(x)$, where $X_k(x)$ are arbitrary known functions of $x$. Show that minimizing $\chi^2$ produces a set of linear equations (called the "normal equations") for the parameters $b_k$.

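$\chi^2$ is defined as,

$$\chi^2 = \sum_i \left( \frac{y_i - y(x_i|\mathbf{b})}{\sigma_i} \right)^2$$

The condition for $\chi^2$ to be a minimum is,

$$\frac{\partial \chi^2}{\partial b_k} = 0 \quad \text{for every } k$$

In other words we have that,

$$\sum_i \frac{y_i - y(x_i|\mathbf{b})}{\sigma_i^2} \, \frac{\partial y(x_i|\mathbf{b})}{\partial b_k} = 0$$

Substituting the basis function expansion for $y(x_i|\mathbf{b})$, whose derivative with respect to $b_k$ is $X_k(x_i)$,

$$\sum_i \frac{1}{\sigma_i^2} \left( y_i - \sum_j b_j X_j(x_i) \right) X_k(x_i) = 0$$

This is equivalent to the following set of equations, one for each $k$, known as the normal equations,

$$\sum_j \left( \sum_i \frac{X_k(x_i) X_j(x_i)}{\sigma_i^2} \right) b_j = \sum_i \frac{X_k(x_i) \, y_i}{\sigma_i^2}$$

If we define the matrix $A_{ij} = X_j(x_i)$, the diagonal matrix $W_{ii} = 1/\sigma_i^2$ and the vectors $\mathbf{y} = (y_i)$ and $\mathbf{b} = (b_k)$, the normal equations are conveniently written as,

$$A^T W A \, \mathbf{b} = A^T W \, \mathbf{y}$$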
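The matrix form of the normal equations can be solved directly. Below is a minimal sketch with NumPy, assuming the basis functions are passed as a list of callables; the function name `solve_normal_equations` and the sample data are hypothetical and only serve as an illustration.

```python
import numpy as np

def solve_normal_equations(x, y, sigma, basis):
    """Weighted linear least squares via A^T W A b = A^T W y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2   # diagonal of W
    A = np.column_stack([f(x) for f in basis])      # A[i, k] = X_k(x_i)
    lhs = A.T @ (w[:, None] * A)                    # A^T W A
    rhs = A.T @ (w * y)                             # A^T W y
    return np.linalg.solve(lhs, rhs)

# Hypothetical data, made up for illustration only
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 5.2, 9.8, 17.1])
sigma = np.array([0.2, 0.2, 0.3, 0.4, 0.5])

# Basis functions X_0(x) = 1, X_1(x) = x, X_2(x) = x^2
basis = [lambda t: np.ones_like(t), lambda t: t, lambda t: t**2]

b = solve_normal_equations(x, y, sigma, basis)
print(b)
```

Forming $A^T W A$ explicitly squares the condition number of the problem, so for poorly conditioned bases a QR- or SVD-based solver such as `np.linalg.lstsq` applied to the rows scaled by $1/\sigma_i$ is usually the safer choice.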

2. A simple example of a linear model is $y(x|\mathbf{b}) = b_0 + b_1 x$, which corresponds to fitting a straight line to data. What are the MLE estimates of $b_0$ and $b_1$ in terms of the data: $x_i$'s, $y_i$'s, and $\sigma_i$'s?

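The maximum likelihood estimate corresponds to the linear least squares problem:

$$\min_{b_0, b_1} \chi^2 = \min_{b_0, b_1} \sum_i \left( \frac{y_i - b_0 - b_1 x_i}{\sigma_i} \right)^2$$

Setting the derivatives with respect to $b_0$ and $b_1$ to zero gives two linear equations. In terms of the weighted sums

$$S = \sum_i \frac{1}{\sigma_i^2}, \quad S_x = \sum_i \frac{x_i}{\sigma_i^2}, \quad S_y = \sum_i \frac{y_i}{\sigma_i^2}, \quad S_{xx} = \sum_i \frac{x_i^2}{\sigma_i^2}, \quad S_{xy} = \sum_i \frac{x_i y_i}{\sigma_i^2}$$

they read $S b_0 + S_x b_1 = S_y$ and $S_x b_0 + S_{xx} b_1 = S_{xy}$, with solution

$$b_0 = \frac{S_{xx} S_y - S_x S_{xy}}{S S_{xx} - S_x^2}, \qquad b_1 = \frac{S S_{xy} - S_x S_y}{S S_{xx} - S_x^2}$$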
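For the straight-line case the two normal equations can be solved by hand, which gives a closed-form estimate in terms of weighted sums over the data. The following is a small sketch in plain Python; the function name `fit_line` and the test points are hypothetical.

```python
def fit_line(x, y, sigma):
    """Closed-form weighted least-squares fit of y = b0 + b1 * x."""
    w = [1.0 / s**2 for s in sigma]                        # weights 1 / sigma_i^2
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = S * Sxx - Sx**2                                # determinant of the 2x2 system
    b0 = (Sxx * Sy - Sx * Sxy) / delta
    b1 = (S * Sxy - Sx * Sy) / delta
    return b0, b1

# Hypothetical points lying exactly on y = 1 + 2x, with unequal error bars
b0, b1 = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], [0.5, 1.0, 0.5, 2.0])
print(b0, b1)
```

Because the sample points lie exactly on a line, the fit recovers the intercept and slope regardless of the weights; with noisy data the $\sigma_i$'s control how strongly each point pulls on the result.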

#### To Think About

1. We often rather casually assume a uniform prior on the parameters $\mathbf{b}$. If the prior is not uniform, then is minimizing $\chi^2$ the right thing to do? If not, then what should you do instead? Can you think of a situation where the difference would be important?

2. What if, in lecture slide 2, the measurement errors were instead of ? How would you find MLE estimates for the parameters ?
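On the first question, one common answer is to maximize the posterior instead of the likelihood, which amounts to minimizing $\chi^2 - 2 \ln P(\mathbf{b})$ rather than $\chi^2$ alone. The sketch below assumes a linear model and an independent Gaussian prior on each parameter, so the extra term is $\sum_k ((b_k - \mu_k)/\tau_k)^2$; the function name `map_linear_fit`, the prior means `mu`, the prior widths `tau`, and the data are all hypothetical.

```python
import numpy as np

def map_linear_fit(A, y, sigma, mu, tau):
    """Minimize chi^2 + sum_k ((b_k - mu_k) / tau_k)^2 for the model y = A b."""
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2   # data weights
    p = 1.0 / np.asarray(tau, dtype=float) ** 2     # prior precisions
    lhs = A.T @ (w[:, None] * A) + np.diag(p)
    rhs = A.T @ (w * y) + p * np.asarray(mu, dtype=float)
    return np.linalg.solve(lhs, rhs)

# Hypothetical data: three points, straight-line model
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.9, 3.2, 4.8])
sigma = np.array([1.0, 1.0, 1.0])
A = np.column_stack([np.ones_like(x), x])
mu = np.array([0.0, 0.0])

b_wide = map_linear_fit(A, y, sigma, mu, tau=[1e6, 1e6])     # nearly uniform prior
b_tight = map_linear_fit(A, y, sigma, mu, tau=[1e-3, 1e-3])  # prior dominates the data
print(b_wide, b_tight)
```

With a very wide prior the result is indistinguishable from the ordinary $\chi^2$ minimum, while a tight prior pulls the estimate toward `mu`. The difference matters most when the data are few or noisy compared with the prior width.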

#### Class activity

Here are the stages that each group should get to:

1. Read in the data and plot the data points, including error bars or some other graphical indication of the $\sigma_i$'s.

2. Hmm. They look kind of like a raised parabola, don't they? Try fitting a model of the form $y(x|\mathbf{b}) = b_0 + b_1 x^2$. What are the best fitting values for $\mathbf{b}$? Plot the best fit curve on the same plot as you produced in stage 1. Does it look like a good fit? What is your value of $\chi^2$?

At this stage you might want to automate your process so that you can quickly plug in the following models and get best-fit parameters, $\chi^2$, and a graphical plot.

3. Do a linear fit to see how bad it is: $y(x|\mathbf{b}) = b_0 + b_1 x$

4. Try an exponential:

5. Try adding a linear term to the parabola to get a general quadratic: $y(x|\mathbf{b}) = b_0 + b_1 x + b_2 x^2$

6. Does the ordering of $\chi^2$ values seem to match your intuitive impression of which curve fits best?

7. Calculate standard errors for your fitted parameters using the Hessian matrix (as described in the segment). Is the coefficient of the linear term in stage 5 different enough from zero that you are sure it isn't zero? (That is, are you justified in adding the extra parameter to the original stage 2 parabola?)
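The stages above can be automated along the following lines. This is only a sketch under stated assumptions: the class data set is not reproduced here, so a synthetic stand-in is generated, and the helper `fit_linear_model` is a hypothetical name. It covers the line, the raised parabola, and the general quadratic, which are all linear in their parameters. Standard errors come from the inverse of half the Hessian of $\chi^2$, which for such models is the matrix $A^T W A$.

```python
import numpy as np

def fit_linear_model(x, y, sigma, basis):
    """Weighted least squares for a model that is linear in its parameters.

    Returns the best-fit parameters, their standard errors, and the minimum chi^2.
    """
    A = np.column_stack([f(x) for f in basis])   # A[i, k] = X_k(x_i)
    w = 1.0 / sigma**2
    alpha = A.T @ (w[:, None] * A)               # half the Hessian of chi^2
    b = np.linalg.solve(alpha, A.T @ (w * y))
    cov = np.linalg.inv(alpha)                   # parameter covariance matrix
    chi2 = np.sum(((y - A @ b) / sigma) ** 2)
    return b, np.sqrt(np.diag(cov)), chi2

def one(t):
    return np.ones_like(t)

# Synthetic stand-in for the class data set (NOT the real data)
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 25)
sigma = np.full_like(x, 0.5)
y = 2.0 + 0.7 * x**2 + rng.normal(0.0, sigma)

models = {
    "line": [one, lambda t: t],
    "raised parabola": [one, lambda t: t**2],
    "general quadratic": [one, lambda t: t, lambda t: t**2],
}
results = {name: fit_linear_model(x, y, sigma, basis) for name, basis in models.items()}
for name, (b, err, chi2) in results.items():
    print(f"{name:18s} chi2 = {chi2:8.2f}  b = {b}  +/- {err}")
```

For stage 7, compare the second entry of `b` for the general quadratic (the linear-term coefficient in this sketch) with its standard error. The exponential model is not linear in its parameters, so it needs an iterative minimizer such as `scipy.optimize.curve_fit` instead of this helper. The stage 1 plot could be produced with `matplotlib.pyplot.errorbar(x, y, yerr=sigma, fmt="o")`.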