Python 4 - Numpy and Matplotlib

From Computational Statistics (CSE383M and CS395T)

Before class:

Read the Numpy tutorial here and the Scipy tutorial here. (Note: a different Numpy tutorial was linked in an earlier version of this page.)

Read the Matplotlib tutorial here. If you care to, play around with the 'recreate the figure' problems towards the end.

NOTE: http://nbviewer.ipython.org/ was down for a while on Monday morning but appears to be back up now.

In class:

Define 'the process' to be: add up 12 independent draws from the Uniform[0, 1] distribution and subtract the number 6 from the result.
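
As a starting point, the definition above might be sketched in numpy as follows. The function name `draw_process` and the optional `rng` argument are illustrative choices, not part of the exercise; the sketch assumes a numpy version that provides `np.random.default_rng`.

```python
import numpy as np

def draw_process(n_draws, rng=None):
    """Return n_draws samples of 'the process' (hypothetical helper).

    Each sample is the sum of 12 independent Uniform[0, 1] draws minus 6.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One row per sample, 12 uniform draws per row; sum across each row.
    uniforms = rng.uniform(0.0, 1.0, size=(n_draws, 12))
    return uniforms.sum(axis=1) - 6.0
```

Drawing all 12 uniforms for every sample in a single array avoids a Python-level loop, which matters once the number of samples reaches 100,000.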

In this exercise, we will use numpy, scipy, and matplotlib to explore how well the process approximates a draw from a standard normal distribution.

1. Produce a histogram comparing the distribution of 100,000 draws from the process to 100,000 draws from a standard normal distribution.
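
One possible way to set up the comparison is sketched below. The bin range of -5 to 5, the number of bins, the step-style histograms, and the function name `plot_histogram_comparison` are all assumptions made for illustration; other choices may show the differences more clearly.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_histogram_comparison(n_draws=100_000, seed=0):
    """Overlay histograms of the process and of standard normal draws."""
    rng = np.random.default_rng(seed)
    # The process: sum of 12 Uniform[0, 1] draws, minus 6.
    process = rng.uniform(0.0, 1.0, size=(n_draws, 12)).sum(axis=1) - 6.0
    normal = rng.standard_normal(n_draws)

    # Shared bin edges so the two histograms are directly comparable.
    bins = np.linspace(-5.0, 5.0, 101)
    fig, ax = plt.subplots()
    ax.hist(process, bins=bins, density=True, histtype="step",
            label="the process")
    ax.hist(normal, bins=bins, density=True, histtype="step",
            label="standard normal")
    ax.set_xlabel("x")
    ax.set_ylabel("estimated density")
    ax.legend()
    return fig, process, normal
```

Calling `plot_histogram_comparison()` followed by `plt.show()` displays the figure. Switching the y-axis to a log scale with `ax.set_yscale("log")` can make any disagreement in the tails easier to see.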

2. Using characteristic functions, find a way to computationally evaluate the p.d.f. of a random variable produced by the process. Confirm (numerically) that the p.d.f. you produce is properly normalized.

Hints:

  • If X ~ Uniform[0, 1] and Y = X - 1/2, then Y ~ Uniform[-1/2, 1/2].
  • Euler's formula can make some integrals clearer.
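
Following the hints, one route (a sketch, not necessarily the intended solution) goes like this: the characteristic function of Y ~ Uniform[-1/2, 1/2] works out to sin(t/2)/(t/2), so the sum of 12 independent copies has characteristic function (sin(t/2)/(t/2))^12, and the p.d.f. follows from the Fourier inversion formula. The code below evaluates that inversion integral with `scipy.integrate.quad`. The truncation point `t_max=60`, the grid spacing, and the function names are assumptions chosen for illustration.

```python
import numpy as np
from scipy import integrate

def char_fn(t):
    """Characteristic function of the process: (sin(t/2) / (t/2))**12.

    np.sinc(x) computes sin(pi x) / (pi x), so sinc(t / (2 pi)) equals
    sin(t/2) / (t/2) and is well defined at t = 0.
    """
    return np.sinc(t / (2.0 * np.pi)) ** 12

def process_pdf(x, t_max=60.0):
    """p.d.f. of the process at a scalar x via Fourier inversion.

    f(x) = (1/pi) * integral_0^inf cos(t x) * char_fn(t) dt.
    The integral is truncated at t_max (an assumption); the integrand is
    bounded by (2/t)**12, so the neglected tail is negligible.
    """
    integrand = lambda t: np.cos(t * x) * char_fn(t)
    value, _ = integrate.quad(integrand, 0.0, t_max, limit=500)
    return value / np.pi

def check_normalization(n_points=241):
    """Integrate the p.d.f. over its support [-6, 6] on a uniform grid."""
    xs = np.linspace(-6.0, 6.0, n_points)
    values = np.array([process_pdf(x) for x in xs])
    return integrate.trapezoid(values, x=xs)
```

If the derivation is right, `check_normalization()` should return a value very close to 1, and `process_pdf` should be symmetric about 0 and vanish outside [-6, 6].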

3. Produce three plots showing

  • the p.d.f. of the process and the p.d.f. of a standard normal distribution.
  • the point-wise difference between these p.d.f.s.
  • the relative difference between these p.d.f.s.
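
The three plots might be laid out as in the sketch below. It assumes the p.d.f. of the process is obtained by numerically inverting the characteristic function (sin(t/2)/(t/2))^12, takes "relative difference" to mean the point-wise difference divided by the standard normal p.d.f., and restricts the x-range to [-5, 5]; all three are assumptions, as are the function names.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import integrate, stats

def process_pdf(x, t_max=60.0):
    """p.d.f. of the process at scalar x by inverting its characteristic
    function; np.sinc(t / (2 pi)) equals sin(t/2) / (t/2)."""
    integrand = lambda t: np.cos(t * x) * np.sinc(t / (2.0 * np.pi)) ** 12
    value, _ = integrate.quad(integrand, 0.0, t_max, limit=500)
    return value / np.pi

def plot_pdf_comparison(n_points=201):
    """Plot both p.d.f.s, their difference, and their relative difference."""
    xs = np.linspace(-5.0, 5.0, n_points)
    f_process = np.array([process_pdf(x) for x in xs])
    f_normal = stats.norm.pdf(xs)
    difference = f_process - f_normal
    # Assumed definition: difference measured relative to the normal p.d.f.
    relative = difference / f_normal

    fig, axes = plt.subplots(3, 1, sharex=True, figsize=(6, 9))
    axes[0].plot(xs, f_process, label="the process")
    axes[0].plot(xs, f_normal, "--", label="standard normal")
    axes[0].set_ylabel("p.d.f.")
    axes[0].legend()
    axes[1].plot(xs, difference)
    axes[1].set_ylabel("difference")
    axes[2].plot(xs, relative)
    axes[2].set_ylabel("relative difference")
    axes[2].set_xlabel("x")
    fig.tight_layout()
    return fig, xs, difference, relative
```

Far out in the tails both p.d.f.s are tiny, so the relative difference there is sensitive to quadrature error; tightening the tolerances passed to `quad` may be worthwhile if that region is of interest.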

To think about:

  • Is there anything special about the number 12 in the definition of the process?
  • Invent a situation in which the process would not be an appropriate way to produce an approximately normal deviate.

Bonus:

What is the expected number of independent Uniform[0, 1] draws you need to add up before the sum exceeds 1?

  • Perform a simulation to produce a guess at the answer.
  • Try to prove analytically that your guess is right.
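
A simulation along these lines could be sketched as follows. It assumes that the count includes the draw that pushes the sum past 1; the function names, the number of trials, and the seed are illustrative.

```python
import numpy as np

def draws_to_exceed_one(rng):
    """Count Uniform[0, 1] draws until the running sum exceeds 1.

    The count includes the draw that takes the sum past 1 (an assumption
    about how the question is meant to be read).
    """
    total = 0.0
    count = 0
    while total <= 1.0:
        total += rng.uniform()
        count += 1
    return count

def estimate_expected_draws(n_trials=200_000, seed=0):
    """Monte Carlo estimate of the expected number of draws."""
    rng = np.random.default_rng(seed)
    counts = [draws_to_exceed_one(rng) for _ in range(n_trials)]
    return float(np.mean(counts))
```

Running `estimate_expected_draws()` with a few different seeds gives a sense of how many digits of the estimate are stable before guessing at a closed form.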

Jeff's code