# User:Kai:Segment7

## To calculate

**1. Prove the result of slide 3 the "mechanical way" by setting the derivative of something equal to zero, and solving.**
<math>\mathrm{d}\Delta^2/\mathrm{d}a = \mathrm{d}\langle (x-a)^2 \rangle/\mathrm{d}a = -2\langle x-a \rangle = -2(\langle x \rangle -a) = 0 \Rightarrow a = \langle x \rangle</math>
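As a quick numerical sanity check, the sketch below evaluates the mean squared deviation of a small made-up sample about a range of candidate values of a and confirms that the minimum sits at the sample mean. The sample values and the helper name `mean_sq_dev` are illustrative assumptions, not part of the original exercise.

```python
from statistics import mean

def mean_sq_dev(xs, a):
    """Mean squared deviation <(x - a)^2> of the sample xs about the point a."""
    return sum((x - a) ** 2 for x in xs) / len(xs)

# Made-up sample, for illustration only
xs = [1.0, 2.0, 4.0, 7.0, 11.0]
a_star = mean(xs)

# Scan candidate values of a on a grid from 0 to 12 in steps of 0.01
grid = [i / 100 for i in range(0, 1201)]
a_best = min(grid, key=lambda a: mean_sq_dev(xs, a))

print("sample mean:", a_star)
print("grid minimizer:", a_best)
```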

**2. Give an example of a function p(x), with a maximum at x = 0, whose third moment M3 exists, but whose fourth moment M4 doesn't exist.**

Let <math>p(x) = \begin{cases} \frac{4x^3}{(x^4+1)^2} & \mbox{if } x \geq 0 \\ 0 & \mbox{if } x <0 \\ \end{cases} </math>

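The requirement that the maximum be at x = 0 is not restrictive: if <math>x_0</math> is the maximum point of <math>p(x)</math>, then let <math>p_1(x) = p(x+x_0)</math>. <math>p_1(x)</math> has its maximum at x = 0, and since a finite shift does not change the tail behavior, the same moments exist for <math>p_1(x)</math> as for <math>p(x)</math>.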

The density is normalized: <math>\int_{-\infty}^{\infty}p(x)dx = \int_{0}^{\infty}\frac{4x^3}{(x^4+1)^2}dx = \left[-\frac{1}{x^4+1}\right]_{0}^{\infty} = 1 </math>

<math> \int_{-\infty}^{\infty}p(x)x^3 dx = \int_{0}^{\infty}\frac {4x^6}{(x^4+1)^2} dx = \int_{0}^{1}\frac {4x^6}{(x^4+1)^2} dx +\int_{1}^{\infty}\frac {4x^6}{(x^4+1)^2} dx </math>

For the first part, the integrand is continuous and bounded on <math>[0,1]</math>, so <math>\int_{0}^{1}\frac {4x^6}{(x^4+1)^2} dx < \infty</math>.

For the second part, <math>\int_{1}^{\infty}\frac {4x^6}{(x^4+1)^2} dx < \int_{1}^{\infty}\frac {4x^6}{x^8} dx = \int_{1}^{\infty}\frac {4}{x^2} dx =4 </math>.

Thus, <math>M_3</math> exists.

<math> \int_{-\infty}^{\infty}p(x)x^4 dx = \int_{0}^{\infty}\frac {4x^7}{(x^4+1)^2} dx = \int_{0}^{\infty}\frac {d((x^4+1)^2)-d(2x^4+2)}{2(x^4+1)^2} </math>

<math>\int_{0}^{\infty}\frac {d((x^4+1)^2)}{2(x^4+1)^2} \rightarrow \infty</math> while <math>\int_{0}^{\infty}\frac {d(2x^4+2)}{2(x^4+1)^2} = 1 </math>

Thus, <math>M_4</math> does not exist.
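The contrast between the two moments can also be seen numerically. The sketch below (function names, step size, and cutoffs are illustrative choices) integrates <math>x^k p(x)</math> from 0 up to a finite cutoff with a composite Simpson rule and prints the partial integrals for k = 3 and k = 4 as the cutoff grows.

```python
def p(x):
    """The density from the text: 4x^3 / (x^4 + 1)^2 for x >= 0, else 0."""
    return 4 * x**3 / (x**4 + 1) ** 2 if x >= 0 else 0.0

def partial_moment(k, cutoff, h=0.005):
    """Composite Simpson estimate of the integral of x^k p(x) over [0, cutoff]."""
    n = int(round(cutoff / h))
    n += n % 2  # Simpson's rule needs an even number of intervals
    h = cutoff / n
    total = 0.0
    for i in range(n + 1):
        x = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * x**k * p(x)
    return total * h / 3

for cutoff in (10, 100, 1000):
    m3 = partial_moment(3, cutoff)
    m4 = partial_moment(4, cutoff)
    print(cutoff, round(m3, 4), round(m4, 4))
```

The k = 3 column levels off as the cutoff increases, while the k = 4 column keeps growing by roughly <math>\ln(10^4) \approx 9.2</math> per decade, matching the logarithmic divergence found above.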

**Comment:** Everything sounds great, but how is this function maximum at x = 0?

**Response:** Hi Kumar, if <math>x_0</math> is the maximum point of <math>p(x)</math>, then let <math>p_1(x) = p(x+x_0)</math>. <math>p_1(x)</math> will satisfy the conditions.

**3. List some good and bad things about using the median instead of the mean for summarizing a distribution's central value.**

Because the median is the value separating the lower half of a sample from the upper half, it tells us the middle value of the sample. However, it cannot represent the central value very well when the sample has a skewed, wide-ranging distribution. In practice, comparing the mean and the median tells us more about the distribution of the data than either one alone.
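A small illustration of how the two summaries can disagree, using a made-up right-skewed sample (the numbers are assumptions chosen only for the example):

```python
from statistics import mean, median

# Made-up, right-skewed sample: most values are small, one is very large
sample = [1, 2, 2, 3, 3, 4, 100]

print("mean   =", mean(sample))
print("median =", median(sample))
```

The single large value pulls the mean far above the bulk of the data, while the median stays with the middle of the sample. The size of the gap between the two is itself a hint that the distribution is skewed.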

## To Think About

**1. This segment assumed that p(x) is a known probability distribution. But what if you know p(x) only experimentally. That is, you can draw random values of x from the distribution. How would you estimate its moments?**

The k-th moment can be estimated by <math>M_k = \sum_{i=1}^N \frac {x_i^k}{N}</math>, where <math>x_i</math> is the result of the i-th measurement and N is the total number of measurements. The larger the number of measurements, the closer this estimate comes to the definition of the moment.
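A minimal sketch of this estimator. Since the actual p(x) is not specified, the draws come from a uniform distribution on [0, 1) as a stand-in, for which the exact k-th moment is 1/(k+1). The function name `sample_moment` and the sample size are illustrative assumptions.

```python
import random

def sample_moment(xs, k):
    """Estimate the k-th raw moment <x^k> as the sample average of x_i^k."""
    return sum(x**k for x in xs) / len(xs)

rng = random.Random(0)  # fixed seed so the run is reproducible
# Stand-in for "drawing random values of x from the distribution"
draws = [rng.random() for _ in range(100_000)]

for k in (1, 2, 3):
    # For the uniform distribution on [0, 1) the exact value is 1/(k+1)
    print(k, sample_moment(draws, k), 1 / (k + 1))
```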

**2. High moments (e.g., 4 or higher) are algebraically pretty, but they are rarely useful because they are very hard to measure accurately in experimental data. Why is this true?**

I guess it is because every measurement has some error. Raising x to a higher power amplifies that error more strongly, so the higher the moment, the less accurate its estimate.
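One way to see the effect, sketched under stated assumptions: draw repeated samples from a standard normal distribution (an arbitrary choice of test distribution), estimate the k-th moment from each sample, and look at how much the estimates scatter relative to their average. The function names, sample size, and number of trials are illustrative. Note that this shows the scatter from finite sampling rather than from measurement error on each x.

```python
import random
from statistics import mean, stdev

def sample_moment(xs, k):
    """Sample average of x_i^k."""
    return sum(x**k for x in xs) / len(xs)

def relative_spread(k, n_draws=100, n_trials=300, seed=0):
    """Std / mean of the k-th moment estimate over repeated standard-normal samples."""
    rng = random.Random(seed)
    estimates = [
        sample_moment([rng.gauss(0.0, 1.0) for _ in range(n_draws)], k)
        for _ in range(n_trials)
    ]
    return stdev(estimates) / mean(estimates)

for k in (2, 4, 6, 8):
    print(k, round(relative_spread(k), 3))
```

The relative scatter grows quickly with k, because the estimate of a high moment is dominated by the few largest values in the sample.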

**3. Even knowing that it is useless, how would you find the formula for I8, the eighth semi-invariant?**

According to the definition of the cumulant on Wikipedia, the n-th cumulant can be obtained from the cumulant-generating function <math>\begin{align}g(t) &= \log(\operatorname{E}(e^{tX})) = - \sum_{n=1}^\infty \frac{1}{n}\left(1-\operatorname{E}(e^{tX})\right)^n = - \sum_{n=1}^\infty \frac{1}{n}\left(-\sum_{m=1}^\infty \mu'_m \frac{t^m}{m!}\right)^n \end{align}</math>

<math>\begin{align} \kappa_1 &= g'(0) = \mu'_1 = \mu, \\ \kappa_2 &= g''(0) = \mu'_2 - {\mu'_1}^2 = \sigma^2, \\ &{} \ \ \vdots \\ \kappa_n &= g^{(n)}(0), \\ &{} \ \ \vdots \end{align} </math>

Here the moments are central moments, so we can let <math>\mu'_1=0</math>

To calculate <math>\kappa_8</math>, we just need to calculate several combinations in the formal power series of g(t).

<math>\begin{align} \kappa_8 &= -8!\left[\left[-\frac {\mu'_8}{8!} \right]^1+\frac 12 \left[\frac {2\mu'_3 \mu'_5} {3!5!}+\frac {2\mu'_2 \mu'_6} {2!6!}+\frac {\mu'_4 \mu'_4} {4!4!}\right] - \frac 13 \left[\frac {3\mu'_2 \mu'_2\mu'_4}{2!2!4!} +\frac {3\mu'_2 \mu'_3\mu'_3 } {2!3!3!}\right]+\frac 14 \left[ \frac {\mu'_2} {2!}\right]^4 \right] \\ &= \mu'_8-56 \mu'_3 \mu'_5 - 28 \mu'_2 \mu'_6 - 35 {\mu'_4}^2 + 420 {\mu'_2}^2 \mu'_4 + 560 \mu'_2 {\mu'_3}^2-630 {\mu'_2}^4 \end{align}</math>
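The final expression can be checked against distributions whose cumulants are known. The sketch below codes the formula directly (the function name `kappa8` and the dictionary layout are illustrative) and applies it to a unit Gaussian, whose cumulants beyond the second all vanish, and to an exponential distribution with unit rate, whose n-th cumulant is (n-1)!.

```python
def kappa8(mu):
    """Eighth cumulant from central moments; mu[k] is the k-th central moment."""
    return (
        mu[8]
        - 56 * mu[3] * mu[5]
        - 28 * mu[2] * mu[6]
        - 35 * mu[4] ** 2
        + 420 * mu[2] ** 2 * mu[4]
        + 560 * mu[2] * mu[3] ** 2
        - 630 * mu[2] ** 4
    )

# Unit Gaussian: odd central moments vanish, even ones are (k-1)!! = 1, 3, 15, 105
gaussian = {2: 1, 3: 0, 4: 3, 5: 0, 6: 15, 8: 105}
print(kappa8(gaussian))  # → 0

# Exponential with unit rate: central moments for k = 2, 3, 4, 5, 6, 8
exponential = {2: 1, 3: 2, 4: 9, 5: 44, 6: 265, 8: 14833}
print(kappa8(exponential))  # → 5040, i.e. 7!
```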

## Class activity

Teamed with Dan, Sean Trettel, Noah

<math>P(W,B,D|w,b,d) = \frac {(W+B+D)!}{W!B!D!}w^W b^B d^D </math>

<math>P(w,b,d|W,B,D) = \frac {P(W,B,D|w,b,d)P(w,b,d|I)}{\iint {P(W,B,D|w,b,1-w-b)P(w,b,1-w-b|I)}dw db}</math>

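Assuming a uniform prior <math>P(w,b,d|I)</math>, the multinomial coefficient and the constant prior cancel between numerator and denominator, and the remaining integral is <math>\iint w^W b^B (1-w-b)^D \, dw \, db = \frac {W!B!D!} {(W+B+D+2)!} </math>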
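The closed form on the right-hand side can be checked numerically. The sketch below takes the integrand to be <math>w^W b^B (1-w-b)^D</math> with constant factors dropped (an assumption about what the formula intends), integrates it over the triangle w, b ≥ 0, w + b ≤ 1 with a simple midpoint rule, and compares with the factorial expression. Function names and the grid size are illustrative.

```python
from math import factorial

def dirichlet_integral_exact(W, B, D):
    """W! B! D! / (W + B + D + 2)!"""
    return factorial(W) * factorial(B) * factorial(D) / factorial(W + B + D + 2)

def dirichlet_integral_numeric(W, B, D, n=400):
    """Midpoint-rule estimate of the integral of w^W b^B (1-w-b)^D over the triangle.

    Cells cut by the edge w + b = 1 are handled crudely, so this is only
    accurate when D >= 1 and the integrand vanishes on that edge.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h
        for j in range(n - i):
            b = (j + 0.5) * h
            d = 1.0 - w - b
            if d > 0:
                total += w**W * b**B * d**D
    return total * h * h

for counts in [(1, 1, 1), (2, 4, 4)]:
    print(counts, dirichlet_integral_numeric(*counts), dirichlet_integral_exact(*counts))
```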

The code for the graph of <math>P(w,b,d|N)</math> is here. The part for parsing text comes from Kumar's work last week.

When N>1000, a direct calculation of <math>P(w,b,d|W,B,D)</math> does not work, because the powers <math>w^W</math>, <math>b^B</math> and <math>d^D</math> underflow and factorials such as <math>N!</math> overflow the range of floating-point numbers. The way around this is to work with logarithms: use log() for the powers and math.lgamma() for the factorials, sum the terms, and apply exp() only to the final result.
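A minimal sketch of that log-space calculation, not the team's original code. It assumes a uniform prior, so that the posterior density is <math>\frac{(W+B+D+2)!}{W!B!D!} w^W b^B d^D</math> with d = 1 - w - b. The function name `log_posterior` and the treatment of boundary points are illustrative choices.

```python
import math

def log_posterior(w, b, W, B, D):
    """log P(w, b, d | W, B, D) for d = 1 - w - b, assuming a uniform prior."""
    d = 1.0 - w - b
    if min(w, b, d) <= 0.0:
        # Simplification: points on or outside the boundary get zero density
        return float("-inf")
    N = W + B + D
    # log[(N+2)! / (W! B! D!)], using lgamma(n + 1) = log(n!)
    log_norm = (math.lgamma(N + 3) - math.lgamma(W + 1)
                - math.lgamma(B + 1) - math.lgamma(D + 1))
    return log_norm + W * math.log(w) + B * math.log(b) + D * math.log(d)

W, B, D = 3926, 2875, 3198  # the largest counts listed on this page
N = W + B + D
log_p = log_posterior(W / N, B / N, W, B, D)
print(log_p, math.exp(log_p))
```

Every intermediate quantity here is a sum of moderate-sized logarithms, so nothing overflows or underflows even though the individual powers and factorials would.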

The graphs are:

N=0, W B D = 0 0 0

N=3, W B D = 1 1 1

N=10, W B D = 2 4 4

N=100, W B D = 31 28 41

N=1000, W B D = 368 378 254

N=10000, W B D = 3926 2875 3198

As N becomes larger, the distribution p(w,b) becomes narrower. When N=10000, we have <math>p(w,b)\approx \delta(w-W/N)\delta(b-B/N)</math>