# Segment 24. Goodness of Fit

#### Watch this segment


The direct YouTube link is http://youtu.be/EJleSVf0Z-U

Links to the slides: PDF file or PowerPoint file

### Problems

#### To Calculate

1. Let $X$ be an R.V. that is a linear combination (with known, fixed coefficients $\alpha_k$) of twenty $N(0,1)$ deviates. That is, $X = \sum_{k=1}^{20} \alpha_k T_k$ where $T_k \sim N(0,1)$. How can you most simply form a t-value-squared (that is, something distributed as $\text{Chisquare}(1)$) from $X$? For some particular choice of $\alpha_k$'s (random is ok), generate a sample of $x$'s, plot their histogram, and show that it agrees with $\text{Chisquare}(1)$.

2. From some matrix of known coefficients $\alpha_{ik}$ with $k=1,\ldots,20$ and $i = 1,\ldots,100$, generate 100 R.V.s $X_i = \sum_{k=1}^{20} \alpha_{ik} T_k$ where $T_k \sim N(0,1)$. In other words, you are expanding 20 i.i.d. $T_k$'s into 100 R.V.s. Form the sum of the 100 t-values-squared obtained from these variables and demonstrate numerically, by repeated sampling, that it is distributed as $\text{Chisquare}(\nu)$. What is the value of $\nu$? Use enough samples so that you could distinguish between $\nu$ and $\nu-1$.

3. Reproduce the table of critical $\Delta\chi^2$ values shown in slide 7. Hint: Go back to segment 21 and listen to the exposition of slide 7. (My solution is 3 lines in Mathematica.)
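As a starting point for problem 1, here is a minimal Python sketch (standard library only). It assumes the natural construction: since the $T_k$ are independent, $\text{Var}(X) = \sum_k \alpha_k^2$, so $X^2 / \sum_k \alpha_k^2$ is the square of a single $N(0,1)$ deviate. Instead of plotting a histogram, it compares binned sample fractions with the exact $\text{Chisquare}(1)$ probabilities, using the fact that the $\text{Chisquare}(1)$ CDF is $\mathrm{erf}(\sqrt{x/2})$. The function names, bin edges, seed, and the uniform choice of $\alpha_k$ are illustrative assumptions, not part of the problem.

```python
import math
import random


def t_squared_samples(alphas, n_samples, rng):
    """Draw X = sum_k alpha_k T_k with T_k ~ N(0,1); return X^2 / sum_k alpha_k^2."""
    var_x = sum(a * a for a in alphas)  # Var(X), because the T_k are independent
    out = []
    for _ in range(n_samples):
        x = sum(a * rng.gauss(0.0, 1.0) for a in alphas)
        out.append(x * x / var_x)
    return out


def chisq1_cdf(x):
    """CDF of Chisquare(1): P(Z^2 <= x) = erf(sqrt(x / 2)) for Z ~ N(0,1)."""
    return math.erf(math.sqrt(x / 2.0))


def compare_bins(samples, edges):
    """Return (lo, hi, observed fraction, Chisquare(1) probability) per bin."""
    n = len(samples)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(lo <= s < hi for s in samples) / n
        expected = chisq1_cdf(hi) - chisq1_cdf(lo)
        rows.append((lo, hi, observed, expected))
    return rows


if __name__ == "__main__":
    rng = random.Random(24)  # arbitrary seed, for repeatability only
    alphas = [rng.uniform(-1.0, 1.0) for _ in range(20)]  # "random is ok"
    samples = t_squared_samples(alphas, 20000, rng)
    edges = [0.0, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
    for lo, hi, obs, exp in compare_bins(samples, edges):
        print(f"[{lo:4.2f}, {hi:4.2f})  observed {obs:.4f}  Chisquare(1) {exp:.4f}")
```

The same `samples` list can be handed to any plotting tool to draw the histogram the problem asks for; the binned comparison is only a plot-free stand-in.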

#### To Think About

1. Design a numerical experiment to exemplify the assertions on slide 8, namely that $\chi^2_{min}$ varies by $\pm\sqrt{2\nu}$ from data set to data set, but varies only by $\pm O(1)$ as the fitted parameters $\mathbf b$ vary within their statistical uncertainty.

2. Suppose you want to estimate the central value $\mu$ of a sample of $N$ values drawn from $\text{Cauchy}(\mu,\sigma)$. If your estimate is the mean of your sample, does the "universal rule of thumb" (slide 2) hold? That is, does the accuracy get better as $N^{-1/2}$? Why or why not? What if you use the median of your sample as the estimate? Verify your answers by numerical experiments.
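One possible shape for the numerical experiment in problem 2 is sketched below in Python (standard library only). It draws Cauchy deviates by inverting the CDF, $x = \mu + \sigma \tan(\pi(u - 1/2))$ with $u$ uniform, and measures the scatter of each estimator over many repeated samples. The scatter is summarized by the interquartile range rather than the standard deviation, because the Cauchy distribution has no finite variance. The function names, sample sizes, and trial counts are illustrative assumptions.

```python
import math
import random
import statistics


def cauchy_sample(mu, sigma, n, rng):
    """Draw n Cauchy(mu, sigma) values by inverting the CDF."""
    return [mu + sigma * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def interquartile_range(values):
    """Robust width measure: distance between the 25% and 75% quantiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def estimator_spread(estimator, n, n_trials, rng, mu=0.0, sigma=1.0):
    """Interquartile range of an estimator over n_trials samples of size n."""
    estimates = [estimator(cauchy_sample(mu, sigma, n, rng)) for _ in range(n_trials)]
    return interquartile_range(estimates)


if __name__ == "__main__":
    rng = random.Random(2)  # arbitrary seed, for repeatability only
    print("    N   spread of mean   spread of median")
    for n in (25, 100, 400):
        mean_spread = estimator_spread(statistics.fmean, n, 1000, rng)
        median_spread = estimator_spread(statistics.median, n, 1000, rng)
        print(f"{n:5d}   {mean_spread:14.3f}   {median_spread:16.3f}")
```

Reading the table column by column shows directly whether each estimator's scatter shrinks as $N$ grows by a factor of 4 and then 16.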

### Class Activity

I measured the temperature of my framitron manifold every minute for 1000 minutes, with the same accuracy, $\sigma = 5$, for each measurement. The data is plotted below (with data points connected by straight lines), and is in the file Modelselection1.txt.

It's a contest! Which group can write down a model $T(t|\mathbf{b})$, where $\mathbf{b}$ is a vector of parameters, that gives the best fit to the data in a least squares sense?

Part 1. Any number of parameters in $\mathbf{b}$ is allowed.

Part 2. At most 20 parameters are allowed.

Part 3. At most 10 parameters are allowed.

Part 4. At most 4 parameters are allowed.

And, oh by the way, we'll actually test your model on a different realization of the same process, possibly one of the ones shown below.
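To see why testing on a different realization matters, here is a hedged Python sketch (standard library only) that scores a model by $\chi^2 = \sum_i \left((T_i - T(t_i|\mathbf{b}))/\sigma\right)^2$ on both the realization it was fitted to and a fresh one. Everything beyond $\sigma = 5$ and the 1000 one-minute samples is an assumption made for illustration: the sine-shaped `hypothetical_signal` is invented and is not the framitron data, the second realization is assumed to share the same underlying signal, and the piecewise-constant (bin-mean) model family is just one easy least-squares model whose parameter count can be dialed to match Parts 1 through 4.

```python
import math
import random

SIGMA = 5.0  # measurement accuracy quoted in the activity


def hypothetical_signal(t):
    """Invented smooth temperature curve; NOT the real framitron data."""
    return 100.0 + 20.0 * math.sin(2.0 * math.pi * t / 500.0)


def make_realization(n, rng):
    """One noisy measurement per minute for n minutes."""
    return [hypothetical_signal(t) + rng.gauss(0.0, SIGMA) for t in range(n)]


def fit_bin_means(data, n_params):
    """Least-squares fit of a piecewise-constant model with n_params bins."""
    n = len(data)
    edges = [round(j * n / n_params) for j in range(n_params + 1)]
    means = [sum(data[a:b]) / (b - a) for a, b in zip(edges[:-1], edges[1:])]
    return edges, means


def predict(edges, means):
    """Expand the fitted bin means back into one model value per minute."""
    out = []
    for mean, a, b in zip(means, edges[:-1], edges[1:]):
        out.extend([mean] * (b - a))
    return out


def chi_square(data, model_values, sigma=SIGMA):
    return sum(((d - m) / sigma) ** 2 for d, m in zip(data, model_values))


def compare(n=1000, param_counts=(1000, 20, 10, 4), seed=0):
    """Map each parameter count to (chi2 on fitted data, chi2 on fresh data)."""
    rng = random.Random(seed)
    train = make_realization(n, rng)
    fresh = make_realization(n, rng)
    results = {}
    for m in param_counts:
        fitted = predict(*fit_bin_means(train, m))
        results[m] = (chi_square(train, fitted), chi_square(fresh, fitted))
    return results


if __name__ == "__main__":
    for m, (chi2_train, chi2_fresh) in compare().items():
        print(f"{m:5d} parameters: chi2 fitted = {chi2_train:8.1f}, chi2 fresh = {chi2_fresh:8.1f}")
```

With one parameter per data point the fitted $\chi^2$ is zero, yet under these assumptions the expected $\chi^2$ on a fresh realization is about $2N$, because the model then carries the noise of the data it memorized on top of the noise in the new data.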