CS395T Computational Statistics with Application to Bioinformatics  

Go Back   CS395T Computational Statistics with Application to Bioinformatics > CS395T (Spring 2010) Student Participation Forum > Lecture Slides

Reply
 
Thread Tools Display Modes
  #11  
Old 02-11-2010, 10:33 AM
Corey Bryant Corey Bryant is offline
Member
 
Join Date: Jan 2010
Posts: 13
Default ratio of uniforms proof

In searching for further explanation of the ratio of uniforms method I looked in NR. There is a proof relating ratio of uniforms method to rejection method that I found (and think others may find) helpful in my understanding. I chose not to reiterate the proof here since we all have access to NR. It starts on p. 368 for anyone interested.
Reply With Quote
  #12  
Old 02-11-2010, 01:05 PM
cqian cqian is offline
Member
 
Join Date: Jan 2010
Posts: 16
Default

Quote:
Originally Posted by Corey Bryant View Post
In searching for further explanation of the ratio of uniforms method I looked in NR. There is a proof relating ratio of uniforms method to rejection method that I found (and think others may find) helpful in my understanding. I chose not to reiterate the proof here since we all have access to NR. It starts on p. 368 for anyone interested.
I havn't received NR yet. Is it a proof by explanation, or a proof using Bayes like the one I posted above?
Reply With Quote
  #13  
Old 02-12-2010, 08:36 AM
Chao Ruan Chao Ruan is offline
Member
 
Join Date: Jan 2010
Posts: 32
Default Answer to part 1 &2

an intuitive explanation of why the rejection method for producing random deviates works

In the rejection method, we always come up with another p.d.f which is larger everywhere than the original one, and the inverse of its c.d.f is easy to calculate. This will save us a lot of trouble and make generating process much faster if its hard to find the inverse c.d.f of original function. The correctness of this process holds because the first uniformly random deviate can generate data which has the distribution of our "easier p.d.f", according to the correctness of Transformation Method. But note that this p.d.f is not the original one, but the relationship between the easier one and original one is clear: we just have to get the ratio of this two p.d.f at the particular point, which can be implemented by having the second uniformly random deviate between 0 and the easier p.d.f. It has some chance of falling below the original p.d.f., and this probability is exactly the ratio of the two p.d.f.

a rigorous proof of why the rejection method works
Proof: Follow the notation in class, denote the original p.d.f as p(x), the easier p.d.f as f(x), the c.d.f of f(x) to be F(x), and uniform distribution to be g(y).

Firstly, we have to proof that the first deviate can generate data have the same distribution as f(x).
According to the fundamental transformation law of probabilities for two probability distribution functions f(x) and g(y),
|f(x)dx| = |g(y)dy|
so

Because g(y) is uniformly distributed,
[tex]f(x) =
Thus


So we find a relationship between x and y, and using this relationship, we can always generate f(x) distribution numbers using a uniformly random generator. But our goal is p(x).

So in the second step of the proof, we have to prove the second deviate can generate p(x) distribution numbers from f(x) distribution numbers.
After generating a f(x) distribution number , the probability of it is , but we want the probability to be . The second deviate has accept probability of

which means, if we accept this data, then the probability of this data is

That is exactly the probability we want. Otherwise, the we reject , and using this 2 times deviates again until successfully get an accepted data.

Last edited by Chao Ruan; 02-12-2010 at 08:50 AM.
Reply With Quote
  #14  
Old 02-12-2010, 02:38 PM
simboden simboden is offline
Member
 
Join Date: Jan 2010
Posts: 16
Default Intuitive Explanation of the Rejection Method

I'm not entirely sure exactly how often the rejection method needs to be applied, but it seems like more often than not, you'd find yourself with a distribution that isn't perfect. The inverse of such a distribution therefore wouldn't be available thus rendering the transformation method useless. So you choose a best fit distribution that follows the original as close as possible. The closer the fit, the less rejections necessary.

Correct me if I'm wrong - I'm not exactly well trained in this sort of thinking - but it's equivalent to a two dimensional box with a big red circle in the middle and blue empty space. Allow the red circle to represent the original distribution f, but in order to put it onto a poster to throw darts at, you had to fill the empty space around it with blue. As you throw darts at it, if you get it in the circle that's fine, but if you get it in the blue area, you have to throw another. The size of the red vs the size of the blue depends on the size of the poster. If you keep the size of the red circle fixed but get a bigger poster, then it's more likely that you'll have to throw a few times to get it into the red circle. However, the closer you get the poster to the edges of the circle, the less likely you'll need multiple throws, but it's still random where the dart will end up.

Or at least that's how I think of it. In a ridiculously simplified manner.
Reply With Quote
  #15  
Old 02-12-2010, 03:44 PM
namphuon namphuon is offline
Member
 
Join Date: Jan 2010
Posts: 25
Default

Quote:
Originally Posted by simboden View Post
I'm not entirely sure exactly how often the rejection method needs to be applied, but it seems like more often than not, you'd find yourself with a distribution that isn't perfect. The inverse of such a distribution therefore wouldn't be available thus rendering the transformation method useless. So you choose a best fit distribution that follows the original as close as possible. The closer the fit, the less rejections necessary.

Correct me if I'm wrong - I'm not exactly well trained in this sort of thinking - but it's equivalent to a two dimensional box with a big red circle in the middle and blue empty space. Allow the red circle to represent the original distribution f, but in order to put it onto a poster to throw darts at, you had to fill the empty space around it with blue. As you throw darts at it, if you get it in the circle that's fine, but if you get it in the blue area, you have to throw another. The size of the red vs the size of the blue depends on the size of the poster. If you keep the size of the red circle fixed but get a bigger poster, then it's more likely that you'll have to throw a few times to get it into the red circle. However, the closer you get the poster to the edges of the circle, the less likely you'll need multiple throws, but it's still random where the dart will end up.

Or at least that's how I think of it. In a ridiculously simplified manner.
Yeah, I think that's a correct way of looking at it. I think here's how you would diagram your dart example using pictures. Let the p(x) be the original pdf. To make this easy to see, let p(x) be bin(x, 50,.5), basically, the distribution of 50 fair coin flips.

Now, let's surround p(x) with a box with width and height (X,Y).



If we pick a pair (X1,Y1) uniformly from the box, and only accept if it's below p(x), it's clear that the rate of acceptance is proportional p(x)/Y. Thus, if we generate a histogram of the samples generated by the rejection method, it will look like the original pdf.



Like your dart example, where p(x)/Y is very small, it will have few hits, and where p(x)/Y is very large, it will have more hits. This is a very simplified example.

http://www.cs.utexas.edu/~namphuon/cs395C/rejection.m
Reply With Quote
  #16  
Old 02-12-2010, 04:06 PM
Saurabh Saurabh is offline
Member
 
Join Date: Jan 2010
Posts: 26
Default

Quote:
Originally Posted by jhussmann View Post
To encourage more active discussion on the forums, as part of HW 3, someone should:

- post an intuitive explanation of why the rejection method for producing random deviates works

- post a rigorous proof of why the rejection method works

- post a proof of why the ratio-of-uniforms methods works
  • Several intuitive explanations have been posted.
  • I think the proof posted by TheStig is clear and rigorous enough.
  • A rigorous proof of ratio-of-uniforms method is given in the book "Automatic Nonuniform Random Variate Generation" by Hormann, Leydold and Derflinger. It has a neat explanation of why the method works. Basically, the area under the pdf of the variable we wish to generate is mapped into a half-disc in the upper plane. The electronic version of the book can be accessed from library website. The method and its proof begins on page 32.
Reply With Quote
  #17  
Old 02-12-2010, 04:16 PM
Saurabh Saurabh is offline
Member
 
Join Date: Jan 2010
Posts: 26
Default

Quote:
Originally Posted by simboden View Post
I'm not entirely sure exactly how often the rejection method needs to be applied, but it seems like more often than not, you'd find yourself with a distribution that isn't perfect. The inverse of such a distribution therefore wouldn't be available thus rendering the transformation method useless. So you choose a best fit distribution that follows the original as close as possible. The closer the fit, the less rejections necessary.
I wouldn't call it useless. Even if you don't know the analytical inverse of the pdf, you can always use a numerical procedure (Newton-Raphson is pretty fast) to calculate the inverse for each sampling. Consider a hypothetical situation that the pdf is such that area between the "encapsulating" function and the pdf is huge, so that there are huge number of rejections. So you might be better off calculating inverse numerically and use transformation method.

Since I've never worked in the field, I have NO idea if such a situation actually arises.
Reply With Quote
  #18  
Old 02-12-2010, 09:58 PM
KyongHwa Bae KyongHwa Bae is offline
Member
 
Join Date: Jan 2010
Posts: 10
Default Intuition for rejection Method

Simple intuition that the rejection Method works is as follows.

We sample x from the pdf f(x) that we can sample right away.
Then, the intuition is that x, the sample we just obtained from f(x) is not likely to happen if we have sampled with p(x), which is the pdf we intend to sample from.
The likely ratio that the x will not happen from p(x) against f(x) is p(x)/f(x).
Thus, we sample another from uniform distribution, and reject if sample is greater than the ratio.
Then we are rejecting the sample with the likelyhood that it would not happen with p(x) against f(x).
Reply With Quote
  #19  
Old 02-12-2010, 11:21 PM
KyongHwa Bae KyongHwa Bae is offline
Member
 
Join Date: Jan 2010
Posts: 10
Default Proof for Rejection and Uniforms Method

Solution is attatched.
Attached Images
File Type: pdf HW3_1.pdf (67.2 KB, 26 views)
Reply With Quote
  #20  
Old 05-09-2010, 12:51 AM
Saurabh Saurabh is offline
Member
 
Join Date: Jan 2010
Posts: 26
Default

I have so much faith in Mathematica that I was surprised to see on Slide 3 that it cannot compute the characteristic function of Cauchy Distribution. I tried to compute the Fourier Transform directly and it does give the result.

The mathematica notebook is attached.
Attached Files
File Type: zip cauchy.zip (1.5 KB, 7 views)
Reply With Quote
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -6. The time now is 09:49 AM.


Powered by vBulletin® Version 3.8.4
Copyright ©2000 - 2010, Jelsoft Enterprises Ltd.