# Segment 7 Sanmit Narvekar

## Segment 7

#### To Calculate

1. Prove the result of slide 3 the "mechanical way" by setting the derivative of something equal to zero, and solving.

Fairly straightforward. We start by taking the derivative of the original equation (on slide 3) with respect to a, and setting it equal to 0:

$\displaystyle \frac{d}{da} \left( (\langle x^2 \rangle - \langle x \rangle^2) + ( \langle x \rangle - a)^2 \right) = 0$

The first term in the inner parentheses does not depend on a, so its derivative is zero and we can drop it. We also expand the second inner term:

$\displaystyle \frac{d}{da} ( \langle x \rangle^2 - 2a \langle x \rangle + a^2 )= 0$

Again, the first term disappears since it doesn't depend on a. We take the derivative of the other terms in the usual way:

$\displaystyle -2 \langle x \rangle + 2a = 0$

And simplify to obtain the final result:

$\displaystyle \langle x \rangle = a$
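Since the second derivative with respect to a is 2 > 0, this stationary point is a minimum. As a quick numerical sanity check (a minimal sketch, not part of the original solution), the snippet below scans candidate values of a on a grid and confirms that the sample mean gives the smallest mean squared deviation. The exponential sample, the grid spacing, and the helper name `mean_squared_deviation` are all arbitrary choices made for illustration.

```python
import random
import statistics


def mean_squared_deviation(xs, a):
    """Sample estimate of <(x - a)^2>."""
    return sum((x - a) ** 2 for x in xs) / len(xs)


random.seed(0)  # fixed seed so the run is repeatable
# Arbitrary skewed sample (assumption: any distribution would do here).
xs = [random.expovariate(1.0) for _ in range(1000)]
sample_mean = statistics.fmean(xs)

# Candidate values of a on a grid centered on the sample mean.
grid = [sample_mean + 0.01 * k for k in range(-100, 101)]
best_a = min(grid, key=lambda a: mean_squared_deviation(xs, a))

print(sample_mean, best_a)  # the two values should agree
```

The grid is centered on the sample mean only so that the mean is one of the candidates; any other grid point gives a strictly larger mean squared deviation.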

2. Give an example of a function $\displaystyle p(x)$ , with a maximum at $\displaystyle x=0$ , whose third moment $\displaystyle M_3$ exists, but whose fourth moment $\displaystyle M_4$ doesn't exist.
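One candidate answer (a sketch that is not from the original notes; the specific power 5 is just a convenient choice) is a symmetric density with a power-law tail:

$\displaystyle p(x) = \frac{C}{1 + |x|^5}$

where C is the normalizing constant. This function is largest at $\displaystyle x=0$ . For large $\displaystyle |x|$ the integrand of the n-th moment behaves like $\displaystyle |x|^{n-5}$ , so

$\displaystyle \int |x|^3 \, p(x) \, dx < \infty, \qquad \int x^4 \, p(x) \, dx = \infty$

The first integrand falls off like $\displaystyle |x|^{-2}$ , which is integrable, while the second falls off like $\displaystyle |x|^{-1}$ , which diverges logarithmically. By symmetry, $\displaystyle M_3 = 0$ .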

3. List some good and bad things about using the median instead of the mean for summarizing a distribution's central value.

Advantages of median:

• Less sensitive to outliers. For example, if all your data were at 0 except for one point at 1 million, the mean would be pulled toward the outlier by 1,000,000/n, where n is the total number of points. The median would still be 0, accurately capturing that most of the data is around 0.
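The outlier example can be checked directly. This is a minimal sketch with made-up data (99 points at 0 and one at a million), using only Python's standard `statistics` module.

```python
import statistics

# Hypothetical data: 99 points at 0 and a single outlier at one million.
data = [0.0] * 99 + [1_000_000.0]

mean_value = statistics.fmean(data)     # 1_000_000 / 100
median_value = statistics.median(data)  # average of the two middle values

print(mean_value)    # 10000.0
print(median_value)  # 0.0
```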

Disadvantages of median:

• Consider a distribution with 2 "peaks". If you wanted to get a sense of where the data was centered, the mean would land between the two peaks, at their mass-weighted average. However, the median would typically fall in one peak or the other, depending on which had more mass.
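The two-peak case can be illustrated the same way. The mixture below (60% of points near -5, 40% near +5) is an assumption chosen for illustration, not data from the course.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical bimodal sample: 600 points near -5 and 400 points near +5.
left = [random.gauss(-5.0, 0.5) for _ in range(600)]
right = [random.gauss(5.0, 0.5) for _ in range(400)]
data = left + right

mean_value = statistics.fmean(data)     # near 0.6 * (-5) + 0.4 * 5 = -1
median_value = statistics.median(data)  # lands inside the heavier left peak

print(mean_value, median_value)
```

With these numbers the mean sits between the peaks, while the median stays within the left peak because that peak holds more than half of the points.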

#### To Think About

1. This segment assumed that $\displaystyle p(x)$ is a known probability distribution. But what if you know $\displaystyle p(x)$ only experimentally, that is, you can only draw random values of x from the distribution? How would you estimate its moments?

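Estimating the mean (first moment) is as simple as averaging all the x values you get from sampling from $\displaystyle p(x)$ . The higher moments can be estimated the same way, by averaging powers of x. For example, the second moment would be the average of $\displaystyle x^2$ over the draws. These are the standard sample moments; by the law of large numbers each one converges to the true moment as the number of draws grows, provided that moment exists. For moments about the mean, subtract the sample mean from each draw before taking the power.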
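A minimal sketch of this estimator follows. The normal distribution with mean 1 and standard deviation 2 stands in for the unknown $\displaystyle p(x)$ (an assumption made so the true values are known), and `sample_moment` is a hypothetical helper name.

```python
import random


def sample_moment(xs, k):
    """Estimate the k-th raw moment <x^k> by averaging x**k over the draws."""
    return sum(x ** k for x in xs) / len(xs)


random.seed(2)  # fixed seed so the run is repeatable
# Stand-in for "drawing from p(x)": normal with mean 1, standard deviation 2.
draws = [random.gauss(1.0, 2.0) for _ in range(100_000)]

m1 = sample_moment(draws, 1)  # true value: 1
m2 = sample_moment(draws, 2)  # true value: sigma^2 + mu^2 = 5
# Second moment about the (estimated) mean; true value: sigma^2 = 4.
central2 = sum((x - m1) ** 2 for x in draws) / len(draws)

print(m1, m2, central2)
```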

2. High moments (e.g., 4 or higher) are algebraically pretty, but they are rarely useful because they are very hard to measure accurately in experimental data. Why is this true?
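One commonly given reason (not stated in the original notes) is that the estimate of the k-th moment averages $\displaystyle x^k$ , whose own variance depends on the 2k-th moment, so the estimate is dominated by a few rare draws from the tails. The sketch below shows this numerically for a standard normal: it repeats the estimate many times and compares the relative scatter for k = 2, 4, and 6. The sample size, trial count, and helper names are arbitrary choices for illustration.

```python
import random
import statistics


def sample_moment(xs, k):
    """Estimate the k-th raw moment <x^k> by averaging x**k over the draws."""
    return sum(x ** k for x in xs) / len(xs)


def relative_spread(k, n_draws=100, n_trials=2000, seed=3):
    """Standard deviation of repeated k-th moment estimates, divided by their mean."""
    rng = random.Random(seed)
    estimates = [
        sample_moment([rng.gauss(0.0, 1.0) for _ in range(n_draws)], k)
        for _ in range(n_trials)
    ]
    return statistics.stdev(estimates) / abs(statistics.fmean(estimates))


spread2 = relative_spread(2)
spread4 = relative_spread(4)
spread6 = relative_spread(6)

print(spread2, spread4, spread6)  # the relative scatter grows with k
```

For a heavier-tailed distribution than the normal, the growth with k would be faster still.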

3. Even knowing that it is useless, how would you find the formula for $\displaystyle I_8$ , the eighth semi-invariant?

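Use the cumulants, as referenced in the slides: the semi-invariants are the cumulants of the distribution. Expand the logarithm of the moment generating function, $\displaystyle \ln \langle e^{tx} \rangle$ , as a power series in t. The eighth cumulant is the coefficient of $\displaystyle t^8/8!$ , and writing that coefficient out in terms of the moments gives the formula for $\displaystyle I_8$ .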