# Segment 24. Goodness of Fit

#### Watch this segment

Links to the slides: PDF file or PowerPoint file

### Problems

#### To Calculate

1. Let $X$ be an R.V. that is a linear combination (with known, fixed coefficients $\alpha_k$) of twenty $N(0,1)$ deviates. That is, $X = \sum_{k=1}^{20} \alpha_k T_k$ where $T_k \sim N(0,1)$. How can you most simply form a t-value-squared (that is, something distributed as $\text{Chisquare}(1)$) from $X$? For some particular choice of $\alpha_k$'s (random is ok), generate a sample of $x$'s, plot their histogram, and show that it agrees with $\text{Chisquare}(1)$.

2. From some matrix of known coefficients $\alpha_{ik}$ with $k=1,\ldots,20$ and $i = 1,\ldots,100$, generate 100 R.V.s $X_i = \sum_{k=1}^{20} \alpha_{ik} T_k$ where $T_k \sim N(0,1)$. In other words, you are expanding 20 i.i.d. $T_k$'s into 100 R.V.'s. Form a sum of the 100 t-values-squared obtained from these variables and demonstrate numerically by repeated sampling that it is distributed as $\text{Chisquare}(\nu)$. What is the value of $\nu$? Use enough samples so that you could distinguish between $\nu$ and $\nu-1$.

3. Reproduce the table of critical $\Delta\chi^2$ values shown in slide 7. Hint: Go back to segment 21 and listen to the exposition of slide 7. (My solution is 3 lines in Mathematica.)
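As a starting point for Problem 1, here is a minimal sketch of one possible approach, not a definitive solution. It assumes that the t-value is formed by dividing $X$ by its standard deviation $\sqrt{\sum_k \alpha_k^2}$ (which holds because the $T_k$ are independent with unit variance). To stay within the Python standard library, it compares the empirical CDF with the $\text{Chisquare}(1)$ CDF at a few points instead of plotting a histogram. The function names and the uniform choice of $\alpha_k$ are illustrative assumptions.

```python
import math
import random

def t_squared_samples(alphas, n_samples, rng):
    """Draw n_samples of t^2 = X^2 / Var(X), where X = sum_k alpha_k * T_k."""
    var_x = sum(a * a for a in alphas)  # Var(X) for independent N(0,1) T_k
    out = []
    for _ in range(n_samples):
        x = sum(a * rng.gauss(0.0, 1.0) for a in alphas)
        out.append(x * x / var_x)
    return out

def chisq1_cdf(z):
    """CDF of Chisquare(1): P(T^2 <= z) = erf(sqrt(z / 2))."""
    return math.erf(math.sqrt(z / 2.0))

rng = random.Random(0)
alphas = [rng.uniform(-1.0, 1.0) for _ in range(20)]  # arbitrary illustrative choice
samples = t_squared_samples(alphas, 20000, rng)

# Compare the empirical CDF with the Chisquare(1) CDF at a few points.
for z in (0.1, 0.5, 1.0, 2.0, 4.0):
    empirical = sum(s <= z for s in samples) / len(samples)
    print(f"z={z:4.1f}  empirical={empirical:.4f}  chisq1={chisq1_cdf(z):.4f}")
```

The same `samples` list can be passed to any plotting tool to produce the histogram the problem asks for.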

1. Design a numerical experiment to exemplify the assertions on slide 8, namely that $\chi^2_{\min}$ varies by $\pm\sqrt{2\nu}$ from data set to data set, but varies only by $\pm O(1)$ as the fitted parameters $\mathbf b$ vary within their statistical uncertainty.

2. Suppose you want to estimate the central value $\mu$ of a sample of $N$ values drawn from a Cauchy distribution. If your estimate is the mean of your sample, does the "universal rule of thumb" (slide 2) hold? That is, does the accuracy get better as $N^{-1/2}$? Why or why not? What if you use the median of your sample as the estimate? Verify your answers by numerical experiments.
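For the Cauchy question above, a numerical experiment might be sketched as follows. This is only one way to set it up, under stated assumptions: the Cauchy distribution has unit width and is centered at $\mu = 0$, deviates are drawn by the inverse-CDF method, and the spread of each estimator is measured by its interquartile range over repeated samples (chosen because the standard deviation of a heavy-tailed estimator may not be finite). All function names are illustrative.

```python
import math
import random
import statistics

def cauchy_sample(n, mu, rng):
    """Draw n values from a unit-width Cauchy centered at mu (inverse-CDF method)."""
    return [mu + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

def iqr(values):
    """Interquartile range, a spread measure that stays finite for heavy tails."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def estimator_spread(estimator, n, trials, rng, mu=0.0):
    """IQR of an estimator of mu over repeated samples of size n."""
    return iqr([estimator(cauchy_sample(n, mu, rng)) for _ in range(trials)])

rng = random.Random(1)
for n in (25, 400):
    s_mean = estimator_spread(statistics.fmean, n, 1000, rng)
    s_median = estimator_spread(statistics.median, n, 1000, rng)
    print(f"N={n:4d}  IQR(mean)={s_mean:.3f}  IQR(median)={s_median:.3f}")
```

Increasing $N$ by a factor of 16 should shrink a spread by about a factor of 4 if the $N^{-1/2}$ rule holds, so comparing the two printed rows for each estimator addresses the question directly.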

### Class Activity

I measured the temperature of my framitron manifold every minute for 1000 minutes, with the same accuracy, $\sigma = 5$, for each measurement. The data is plotted below (with data points connected by straight lines), and is in the file Modelselection1.txt.

It's a contest! Which group can write down a model $y(t \mid \mathbf{b})$ for the temperature $y$ at time $t$, where $\mathbf{b}$ is a vector of parameters, that gives the best fit to the data in a least squares sense?

Part 1. Any number of parameters in $\mathbf{b}$ are allowed.

Part 2. At most 20 parameters are allowed.

Part 3. At most 10 parameters are allowed.

Part 4. At most 4 parameters are allowed.

And, oh by the way, we'll actually test your model on a different realization of the same process, possibly one of the ones shown below.
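Why testing on a different realization matters can be illustrated with a small sketch. Everything here is a stand-in: the file Modelselection1.txt is not read, and `true_signal` is a hypothetical smooth process invented for the example, not the actual framitron data. Only $\sigma = 5$ and the 1000 one-minute measurements come from the activity. The "memorized" model uses one parameter per data point, while the "smooth" model is simply the generating function, standing in for a well-fitted model with few parameters.

```python
import math
import random

SIGMA = 5.0      # measurement error quoted in the activity
N_POINTS = 1000  # one measurement per minute

def true_signal(t):
    """Hypothetical smooth process; NOT the actual framitron data."""
    return 100.0 + 20.0 * math.sin(2.0 * math.pi * t / 250.0)

def realization(rng):
    """One noisy data set: true signal plus N(0, SIGMA^2) noise at each minute."""
    return [true_signal(t) + rng.gauss(0.0, SIGMA) for t in range(N_POINTS)]

def chi_square(data, model_values, sigma=SIGMA):
    """Sum of squared, sigma-normalized residuals."""
    return sum(((y - m) / sigma) ** 2 for y, m in zip(data, model_values))

rng = random.Random(2)
train = realization(rng)  # the data set the model is built from
fresh = realization(rng)  # a different realization of the same process

memorized = list(train)                             # 1000 parameters: one per point
smooth = [true_signal(t) for t in range(N_POINTS)]  # few-parameter stand-in

for name, model in (("memorized", memorized), ("smooth", smooth)):
    print(f"{name:9s}  chi2(train)={chi_square(train, model):8.1f}  "
          f"chi2(fresh)={chi_square(fresh, model):8.1f}")
```

The memorized model fits its own data perfectly, but on the fresh realization each residual is the difference of two independent noise draws, so its $\chi^2$ should come out near twice the number of points, whereas the smooth model should score near the number of points on both data sets.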