# Difference between revisions of "Eleisha's Segment 24: Goodness of Fit"

## Latest revision as of 11:44, 3 April 2014

**To Calculate**

1. Let $X$ be an R.V. that is a linear combination (with known, fixed coefficients $\alpha_k$) of twenty $N(0,1)$ deviates. That is, $X = \sum_{k=1}^{20} \alpha_k T_k$, where $T_k \sim N(0,1)$. How can you most simply form a t-value-squared (that is, something distributed as $\text{Chisquare}(1)$) from $X$? For some particular choice of $\alpha_k$'s (random is ok), generate a sample of $x$'s, plot their histogram, and show that it agrees with $\text{Chisquare}(1)$.
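
One way to approach this, offered as a minimal sketch rather than the intended solution: since $X$ is normal with mean 0 and variance $\sum_k \alpha_k^2$, the quantity $X^2 / \sum_k \alpha_k^2$ should follow $\text{Chisquare}(1)$. The Python sketch below uses only the standard library and, instead of drawing a plot, compares binned frequencies with the exact $\text{Chisquare}(1)$ CDF, $\operatorname{erf}(\sqrt{x/2})$. The function names, bin edges, and sample size are illustrative assumptions.

```python
import math
import random


def chi2_1_cdf(x):
    """Exact CDF of Chisquare(1): P(Z^2 <= x) for Z ~ N(0,1)."""
    return math.erf(math.sqrt(x / 2.0)) if x > 0 else 0.0


def sample_t2(alphas, n_samples, rng):
    """Draw X = sum_k alpha_k * T_k and return the values X^2 / Var(X)."""
    var_x = sum(a * a for a in alphas)  # Var(X) for independent N(0,1) T_k
    samples = []
    for _ in range(n_samples):
        x = sum(a * rng.gauss(0.0, 1.0) for a in alphas)
        samples.append(x * x / var_x)
    return samples


def compare_histogram(samples, edges):
    """Return (lo, hi, observed fraction, expected fraction) for each bin."""
    n = len(samples)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(lo <= s < hi for s in samples) / n
        expected = chi2_1_cdf(hi) - chi2_1_cdf(lo)
        rows.append((lo, hi, observed, expected))
    return rows
```

For example, with `rng = random.Random(1)` and `alphas = [rng.uniform(-2, 2) for _ in range(20)]`, the rows from `compare_histogram(sample_t2(alphas, 20000, rng), [0, 0.1, 0.5, 1, 2, 4, 8])` should show observed and expected fractions that agree to within sampling noise.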

2. From some matrix of known coefficients $\alpha_{ik}$ with $k=1,\ldots,20$ and $i = 1,\ldots,100$, generate 100 R.V.s $X_i = \sum_{k=1}^{20} \alpha_{ik} T_k$, where $T_k \sim N(0,1)$. In other words, you are expanding 20 i.i.d. $T_k$'s into 100 R.V.s. Form a sum of 100 t-values-squared obtained from these variables and demonstrate numerically by repeated sampling that it is distributed as $\text{Chisquare}(\nu)$. What is the value of $\nu$? Use enough samples so that you could distinguish between $\nu$ and $\nu-1$.
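
The sketch below is one possible reading of this exercise, not necessarily the intended one. Because the 100 $X_i$ are built from only 20 independent deviates, they are correlated: a plain sum of $X_i^2/\mathrm{Var}(X_i)$ has mean 100 but a larger variance than $\text{Chisquare}(100)$ would have. The code therefore also computes the covariance-weighted form $X^T \Sigma^{+} X$ with $\Sigma = \alpha\alpha^T$, which follows $\text{Chisquare}$ with as many degrees of freedom as the rank of $\alpha$. For a full-column-rank $\alpha$ this equals the squared norm of the least-squares solution of $\alpha t = X$, which is how it is evaluated here. All names and default sizes are assumptions for illustration.

```python
import random
import statistics


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def simulate(n_out=100, n_in=20, trials=1000, seed=0):
    """Return (naive sums, covariance-weighted sums) over repeated samples."""
    rng = random.Random(seed)
    alpha = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    var_x = [sum(a * a for a in row) for row in alpha]  # Var(X_i)
    # Normal-equations matrix alpha^T alpha (n_in x n_in).
    gram = [[sum(alpha[i][j] * alpha[i][k] for i in range(n_out))
             for k in range(n_in)] for j in range(n_in)]
    naive, weighted = [], []
    for _ in range(trials):
        t = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
        x = [sum(a * tk for a, tk in zip(row, t)) for row in alpha]
        # Plain sum of squared t-values, ignoring correlations between the X_i.
        naive.append(sum(xi * xi / v for xi, v in zip(x, var_x)))
        # Least-squares solution of alpha t = x; its squared norm is X^T Sigma^+ X.
        rhs = [sum(alpha[i][k] * x[i] for i in range(n_out)) for k in range(n_in)]
        t_hat = solve(gram, rhs)
        weighted.append(sum(v * v for v in t_hat))
    return naive, weighted
```

Comparing `statistics.fmean` and `statistics.variance` of each list against $\nu$ and $2\nu$ gives a quick check. The standard error of the sample mean is about $\sqrt{2\nu/n}$ for $n$ trials, so on the order of a thousand trials or more are needed to separate $\nu$ from $\nu-1$ by several standard errors when $\nu$ is near 20.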

3. Reproduce the table of critical $\Delta\chi^2$ values shown in slide 7. Hint: Go back to segment 21 and listen to the exposition of slide 7. (My solution is 3 lines in Mathematica.)
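
As a sketch of how such values could be tabulated without Mathematica: a critical $\Delta\chi^2$ for confidence level $p$ and $\nu$ degrees of freedom is the point where the $\text{Chisquare}(\nu)$ CDF equals $p$. The Python code below evaluates the CDF with the standard power series for the regularized incomplete gamma function and inverts it by bisection. The function names are hypothetical, and the confidence levels and range of $\nu$ in the usage note are assumptions, since the slide's table is not reproduced here.

```python
import math


def chi2_cdf(x, nu):
    """CDF of Chisquare(nu) via the series for the regularized lower gamma P(nu/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = nu / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of sum_n z^n / (a (a+1) ... (a+n))
    total = term
    n = 0
    while term > total * 1e-16:
        n += 1
        term *= z / (a + n)
        total += term
    # Prefactor z^a e^{-z} / Gamma(a), computed in log space for stability.
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def delta_chi2(p, nu):
    """Return the value x with chi2_cdf(x, nu) == p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, nu) < p:   # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A table then follows from a nested loop, for example `[[round(delta_chi2(p, nu), 2) for nu in range(1, 7)] for p in (0.6827, 0.90, 0.9545, 0.99)]`. For $\nu = 2$ the result can be checked against the closed form $-2\ln(1-p)$.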

**To Think About**

1. Design a numerical experiment to exemplify the assertions on slide 8, namely that $\chi^2_{\min}$ varies by $\pm\sqrt{2\nu}$ from data set to data set, but varies only by $\pm O(1)$ as the fitted parameters $\mathbf{b}$ vary within their statistical uncertainty.
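
One possible experiment, sketched under simplifying assumptions that are not from the slides: fit a straight line $y = b_0 + b_1 x$ to data with known Gaussian noise $\sigma$, with the $x$ values centered so that the two fitted parameters are uncorrelated. The code records $\chi^2_{\min}$ over many simulated data sets, then takes one data set and perturbs the fitted parameters by draws from their own uncertainties ($\sigma/\sqrt{N}$ and $\sigma/\sqrt{\sum x^2}$), recording the increase in $\chi^2$. The true parameter values and sizes are arbitrary choices.

```python
import math
import random
import statistics


def chi2(b0, b1, xs, ys, sigma):
    """Chi-square of the line b0 + b1*x against data with known noise sigma."""
    return sum(((y - b0 - b1 * x) / sigma) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Least-squares line fit; valid as written only when sum(xs) == 0."""
    b0 = sum(ys) / len(ys)
    b1 = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return b0, b1


def experiment(n_points=50, n_datasets=2000, n_perturb=2000, sigma=1.0, seed=0):
    """Return (chi2_min per data set, delta chi2 per parameter perturbation)."""
    rng = random.Random(seed)
    xs = [i - (n_points - 1) / 2.0 for i in range(n_points)]  # centered x values
    sxx = sum(x * x for x in xs)
    true_b0, true_b1 = 1.0, 0.5  # arbitrary "truth" for the simulation
    chi2_mins = []
    for _ in range(n_datasets):
        ys = [true_b0 + true_b1 * x + rng.gauss(0.0, sigma) for x in xs]
        b0, b1 = fit_line(xs, ys)
        chi2_mins.append(chi2(b0, b1, xs, ys, sigma))
    # Keep the last data set fixed and move the parameters within their errors.
    deltas = []
    for _ in range(n_perturb):
        p0 = b0 + rng.gauss(0.0, sigma / math.sqrt(n_points))
        p1 = b1 + rng.gauss(0.0, sigma / math.sqrt(sxx))
        deltas.append(chi2(p0, p1, xs, ys, sigma) - chi2_mins[-1])
    return chi2_mins, deltas
```

With 50 points and 2 parameters, $\nu = 48$, so the standard deviation of `chi2_mins` can be compared with $\sqrt{2\nu} \approx 9.8$, while the mean of `deltas` should be close to the number of fitted parameters.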

2. Suppose you want to estimate the central value $\mu$ of a sample of $N$ values drawn from $\text{Cauchy}(\mu,\sigma)$. If your estimate is the mean of your sample, does the "universal rule of thumb" (slide 2) hold? That is, does the accuracy get better as $N^{-1/2}$? Why or why not? What if you use the median of your sample as the estimate? Verify your answers by numerical experiments.
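
A numerical experiment along these lines might look like the Python sketch below. It draws Cauchy samples by the inverse-CDF method, $\mu + \sigma\tan(\pi(u - 1/2))$ with $u$ uniform on $[0,1)$, and measures the spread of each estimator across many repetitions by its interquartile range, since the variance of a Cauchy variable is undefined. The function names, repetition count, and choice of interquartile range are assumptions made for illustration.

```python
import math
import random
import statistics


def cauchy_sample(n, mu, sigma, rng):
    """Draw n values from Cauchy(mu, sigma) by inverting the CDF."""
    return [mu + sigma * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def iqr(values):
    """Interquartile range, a spread measure that exists for heavy tails."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def estimator_spread(n, reps=1000, mu=0.0, sigma=1.0, seed=0):
    """Return (IQR of sample means, IQR of sample medians) over reps samples of size n."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = cauchy_sample(n, mu, sigma, rng)
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return iqr(means), iqr(medians)
```

Running `estimator_spread(n)` for `n` in, say, 10, 100, and 1000 shows how each spread scales with $N$. The mean of $N$ Cauchy draws is itself distributed as $\text{Cauchy}(\mu,\sigma)$, whose interquartile range is $2\sigma$, which gives a reference value for the first number returned.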

**Back To:** Eleisha Jackson