# Eleisha's Segment 24: Goodness of Fit


## To Calculate

1. Let $X$ be an R.V. that is a linear combination (with known, fixed coefficients $\displaystyle \alpha_k$) of twenty $\displaystyle N(0,1)$ deviates. That is, $\displaystyle X = \sum_{k=1}^{20} \alpha_k T_k$, where $\displaystyle T_k \sim N(0,1)$. How can you most simply form a t-value-squared (that is, something distributed as $\displaystyle \text{Chisquare}(1)$) from $X$? For some particular choice of the $\displaystyle \alpha_k$'s (random is OK), generate a sample of $\displaystyle x$'s, plot their histogram, and show that it agrees with $\displaystyle \text{Chisquare}(1)$.

2. From some matrix of known coefficients $\displaystyle \alpha_{ik}$ with $\displaystyle k=1,\ldots,20$ and $\displaystyle i = 1,\ldots,100$, generate 100 R.V.s $\displaystyle X_i = \sum_{k=1}^{20} \alpha_{ik} T_k$, where $\displaystyle T_k \sim N(0,1)$. In other words, you are expanding 20 i.i.d. $T_k$'s into 100 R.V.s. Form a sum of 100 t-values-squared obtained from these variables and demonstrate numerically by repeated sampling that it is distributed as $\displaystyle \text{Chisquare}(\nu)$. What is the value of $\displaystyle \nu$? Use enough samples so that you could distinguish between $\displaystyle \nu$ and $\displaystyle \nu-1$.

3. Reproduce the table of critical $\displaystyle \Delta\chi^2$ values shown in slide 7. Hint: Go back to segment 21 and listen to the exposition of slide 7. (My solution is 3 lines in Mathematica.)
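A minimal sketch for problem 1, with a random choice of the $\alpha_k$'s: since $X$ is a sum of independent Gaussians, $X \sim N\!\left(0, \sum_k \alpha_k^2\right)$, so dividing by its standard deviation gives a t-value whose square is $\text{Chisquare}(1)$. In place of the histogram plot, a Kolmogorov-Smirnov test serves as a non-graphical check of agreement:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = rng.normal(size=20)          # some particular (random) choice of alpha_k

# X = sum_k alpha_k T_k is N(0, sum_k alpha_k^2), so dividing by the standard
# deviation sqrt(sum_k alpha_k^2) gives a t-value; its square is Chisquare(1).
nsamp = 100_000
T = rng.normal(size=(nsamp, 20))     # 20 N(0,1) deviates per trial
x = T @ alpha                        # one X per trial
t2 = (x / np.sqrt(alpha @ alpha))**2

# In place of a histogram plot: Kolmogorov-Smirnov test against Chisquare(1)
D, p = stats.kstest(t2, stats.chi2(df=1).cdf)
print(f"KS statistic {D:.4f} (p = {p:.3f}), sample mean {t2.mean():.3f}")
```

For a graphical check, `plt.hist(t2, bins=100, density=True)` against `stats.chi2(df=1).pdf` on the same axes shows the same agreement.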
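For problem 2, a sketch that probes $\nu$ numerically without giving the answer away: form each t-value by dividing $X_i$ by its standard deviation (the norm of row $i$ of the coefficient matrix), sum the squares, and compare the sample moments of the sum against the chi-square moments (a $\text{Chisquare}(\nu)$ has mean $\nu$ and variance $2\nu$). The matrix and sample sizes below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 20))       # known coefficients alpha_ik (random is OK)
norms = np.linalg.norm(A, axis=1)    # sd of X_i = norm of row i of A

nsamp = 100_000
T = rng.normal(size=(nsamp, 20))     # 20 i.i.d. N(0,1) deviates per trial
X = T @ A.T                          # shape (nsamp, 100): the 100 R.V.s X_i
S = ((X / norms)**2).sum(axis=1)     # sum of 100 t-values-squared, per trial

# For Chisquare(nu): mean = nu and variance = 2*nu. Reading both moments off
# the sample lets you both estimate nu and test the chi-square claim itself.
print(f"mean(S) = {S.mean():.2f}, var(S) = {S.var():.1f}")
```

With 100,000 trials the standard error of the sample mean is well under 1, enough to distinguish $\nu$ from $\nu-1$.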
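Problem 3 is equally short in Python: each table entry is an inverse chi-square CDF. The confidence levels and range of $\nu$ below are those of the classic Numerical Recipes $\Delta\chi^2$ table and are assumed, not confirmed, to match slide 7:

```python
from scipy import stats

# Assumed levels: 68.27%, 90%, 95.45%, 99%, 99.73%, 99.99%, for nu = 1..6
# fitted parameters. Each entry is the Delta-chi^2 with that much probability
# below it, i.e. the inverse CDF (ppf) of Chisquare(nu).
p_levels = [0.6827, 0.90, 0.9545, 0.99, 0.9973, 0.9999]
print("nu   " + "  ".join(f"{100*p:6.2f}%" for p in p_levels))
for nu in range(1, 7):
    row = [stats.chi2.ppf(p, nu) for p in p_levels]
    print(f"{nu}    " + "  ".join(f"{d:7.2f}" for d in row))
```

The top-left entry, $\Delta\chi^2 = 1.00$ at 68.27% for one parameter, is the familiar "1-sigma means $\Delta\chi^2 = 1$" rule.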

## To Think About

1. Design a numerical experiment to exemplify the assertions on slide 8, namely that $\displaystyle \chi^2_{min}$ varies by $\displaystyle \pm\sqrt{2\nu}$ from data set to data set, but varies only by $\displaystyle \pm O(1)$ as the fitted parameters $\displaystyle \mathbf b$ vary within their statistical uncertainty.
2. Suppose you want to estimate the central value $\displaystyle \mu$ of a sample of $\displaystyle N$ values drawn from $\displaystyle \text{Cauchy}(\mu,\sigma)$ . If your estimate is the mean of your sample, does the "universal rule of thumb" (slide 2) hold? That is, does the accuracy get better as $\displaystyle N^{-1/2}$ ? Why or why not? What if you use the median of your sample as the estimate? Verify your answers by numerical experiments.
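One possible experiment for the first question above, using an illustrative straight-line model $y = b_0 + b_1 x$ with known measurement error $\sigma$ (all the sizes and true parameters here are arbitrary choices): refit many synthetic data sets to see the $\pm\sqrt{2\nu}$ scatter of $\chi^2_{min}$, then for a single data set wiggle $\mathbf b$ within its covariance and watch $\chi^2$ change by only $O(1)$:

```python
import numpy as np

rng = np.random.default_rng(2)
N, sigma = 100, 1.0
x = np.linspace(0.0, 10.0, N)
A = np.vstack([np.ones(N), x]).T            # design matrix for y = b0 + b1*x
btrue = np.array([1.0, 2.0])

# (a) chi^2_min from data set to data set: mean ~ nu, scatter ~ sqrt(2*nu)
chi2min = []
for _ in range(2000):
    y = A @ btrue + sigma * rng.normal(size=N)
    bfit, res, *_ = np.linalg.lstsq(A, y, rcond=None)
    chi2min.append(res[0] / sigma**2)       # res[0] = sum of squared residuals
chi2min = np.array(chi2min)
nu = N - 2
print(f"chi2_min: mean {chi2min.mean():.1f} (nu = {nu}), "
      f"std {chi2min.std():.1f} (sqrt(2*nu) = {np.sqrt(2*nu):.1f})")

# (b) one data set, parameters wiggled within their statistical uncertainty
y = A @ btrue + sigma * rng.normal(size=N)
bfit, res, *_ = np.linalg.lstsq(A, y, rcond=None)
cov = sigma**2 * np.linalg.inv(A.T @ A)     # covariance of the fitted b
db = rng.multivariate_normal(np.zeros(2), cov, size=2000)
dchi2 = [((y - A @ (bfit + d))**2).sum() / sigma**2 - res[0] / sigma**2
         for d in db]
print(f"Delta chi^2 from 1-sigma parameter wiggles: mean {np.mean(dchi2):.2f}")
```

The second number comes out $O(1)$ (its expectation is the number of fitted parameters), tiny compared with the $\pm\sqrt{2\nu} \approx \pm 14$ scatter in part (a).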
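A sketch of the numerical experiment for the Cauchy question: since a Cauchy deviate has no finite variance (indeed, the sample mean of $N$ i.i.d. Cauchy deviates is itself Cauchy-distributed), a robust measure of spread is needed; the 68th percentile of the absolute error is used below. Comparing that spread across a few decades of $N$ reveals whether each estimator obeys the $N^{-1/2}$ rule of thumb:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma = 0.0, 1.0
spreads = {}
for N in (100, 1000, 10000):
    samp = mu + sigma * rng.standard_cauchy(size=(1000, N))
    # Robust spread: 68th percentile of |error| (the variance is infinite)
    mean_err = np.percentile(np.abs(samp.mean(axis=1) - mu), 68)
    med_err = np.percentile(np.abs(np.median(samp, axis=1) - mu), 68)
    spreads[N] = (mean_err, med_err)
    print(f"N={N:6d}: mean-estimate spread {mean_err:7.3f}, "
          f"median-estimate spread {med_err:7.4f}")
```

Whether each column shrinks by the expected factor of $\sqrt{10}$ per decade of $N$ can be read directly off the output.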