User:Trettels:Session 5 - 28JAN

From Computational Statistics (CSE383M and CS395T)
Jump to navigation Jump to search

Bernoulli Random Walks

Group: Noah, Danny, Jzhang

MATLAB Function

function [out] = correlatedBernoulli(nn, NN, pp, qq, depth, total, val)
%---------------------------------------
%   correlatedBernoulli -- calculates the probability of finding nn
%   successes out of NN bernoulli trials when each trial is dependant on
%   the previous trials value.  This function is recursive.
%
%
%   Input --
%       nn - the number of successes requested
%       NN - the total number of trials
%       pp - the probability of drawing a 1 (probability of drawing a 
%           0 = 1-p)
%       qq - on successive trials, the probability of drawing the previous
%           draw's value (probability of drawing normally = 1-qq
%       depth - current depth in the tree (used to check for completion)
%       total - running total number of wins (used to check for completion)
%       val - previous draw value;
%---------------------------------------

% define the transition probabilities
t_1 = (1-qq)*pp;  %p(0->1)
t_2 = (1-qq)*(1-pp);  %p(1->0)
t_3 = qq+(1-qq)*(1-pp);  %p(0->0)
t_4 = qq+pp*(1-qq);  %p(1->1)
if(depth==0)  %check if this is the initial call...
    out=pp*correlatedBernoulli(nn,NN,pp,qq,depth+1,1,1) + (1-pp)*correlatedBernoulli(nn,NN,pp,qq,depth+1,0,0);
    return;
elseif(depth<NN && total <=nn && NN-depth>=nn-total) 
   %if this is a middle call, we haven't hit bottom and we haven't drawn too many successes
   %MODIFICATION -- added 'NN-depth >= nn-total' to check that we can potentially reach nn from the remaining nodes on this path
   %  This should reduce run-time for nn > NN/2

    if(val==1) % previous value was 1
        out= t_2 * correlatedBernoulli(nn,NN,pp,qq,depth+1,total, 0) + t_4 * correlatedBernoulli(nn,NN,pp,qq,depth+1,total+1, 1);
        return;
    else  %previous value was 0
        out = t_3 * correlatedBernoulli(nn,NN,pp,qq,depth+1,total, 0) + t_1 * correlatedBernoulli(nn,NN,pp,qq,depth+1,total+1, 1);
        return;
    end
elseif(depth==NN && total==nn) %We've reached bottom and have exactly the number of trials we want
    %fprintf('.');
    out = 1;
    return;
else % too deep with too few successes, or too many successes
    out=0;
    return;
end

Results

n=0, N=12

tic
correlatedBernoulli(0,12,0.8,0.2,0,0,0)
toc

ans =

     2.632434076845341e-06

Elapsed time is 0.015265 seconds.

n=0, N=50

tic
correlatedBernoulli(0,50,0.8,0.2,0,0,0)
toc

ans =

     3.629547908333721e-23

Elapsed time is 0.007834 seconds.

n=0, N=500

tic
correlatedBernoulli(0,500,0.8,0.2,0,0,0)
toc

ans =

    7.870057013241441e-223

Elapsed time is 0.064752 seconds.

n=14, N=15

tic
correlatedBernoulli(14,15,0.8,0.2,0,0,0)
toc

ans =

   0.157965192982114

Elapsed time is 2.982316 seconds.

n=19, N=20

tic
correlatedBernoulli(19,20,0.8,0.2,0,0,0)
toc

ans =

   0.087203067849006

Elapsed time is 96.095040 seconds.

After "can we complete from here" check (N-depth <math>\geq</math> n-total)

n=19, N=20

tic
correlatedBernoulli(19,20,0.8,0.2,0,0,0)
toc

ans =

   0.087203067849006

Elapsed time is 0.034429 seconds.

n=15, N=20

tic
correlatedBernoulli(15,20,0.8,0.2,0,0,0)
toc

ans =

   0.145589645651604

Elapsed time is 5.447780 seconds.

n=10, N=20

tic
correlatedBernoulli(10,20,0.8,0.2,0,0,0)
toc

ans =

   0.007728168283258

Elapsed time is 48.908892 seconds.

Does it sum to one (within machine accuracy)? I think so.

tic
correlatedBernoulli(0,15,0.8,0.2,0,0,0) + correlatedBernoulli(1,15,0.8,0.2,0,0,0) + correlatedBernoulli(2,15,0.8,0.2,0,0,0) +
 correlatedBernoulli(3,15,0.8,0.2,0,0,0) +correlatedBernoulli(4,15,0.8,0.2,0,0,0) + correlatedBernoulli(5,15,0.8,0.2,0,0,0) +
 correlatedBernoulli(6,15,0.8,0.2,0,0,0) + correlatedBernoulli(7,15,0.8,0.2,0,0,0) + correlatedBernoulli(8,15,0.8,0.2,0,0,0) +
 correlatedBernoulli(9,15,0.8,0.2,0,0,0) + correlatedBernoulli(10,15,0.8,0.2,0,0,0) + correlatedBernoulli(11,15,0.8,0.2,0,0,0) +
 correlatedBernoulli(12,15,0.8,0.2,0,0,0) + correlatedBernoulli(13,15,0.8,0.2,0,0,0) + correlatedBernoulli(14,15,0.8,0.2,0,0,0) +
 correlatedBernoulli(15,15,0.8,0.2,0,0,0)
toc

ans =

   1.000000000000001

Elapsed time is 27.771433 seconds.

Note: using values for <math>N</math> as low as 5000 resulted in a segfault in the 32bit Student Edition of MATLAB under Debian linux. The algorithm for this function is straight forward, for being recursive, and should port to other languages without difficulty.

Comment

hi, Trettels. As you might probably know, a lot of same calculations are repeated in this algorithm,
which leads to an exponential elapsed time needed when n=N/2. If the repeated calculations are avoided,
the runtime can be reduced to about n(N-n).  --Kai

Thoughts

While initially it did seem to be faster than the brute force method discussed in class, further runs such as the n=19, N=20 result showed that this was primarily due to n=0 being a rare event. Rather than depending only on N to determine observed run time, both n and N factor in here, with smaller n decreasing the run-time independent of N.

Comments by Bill P.

I think you are really close to winning the $100, which I actually didn't expect to lose! But you are not quite there yet. Wpress 11:34, 30 January 2013 (CST)

Well, close, but not soon enough. Kai has won, with an entry submitted yesterday. I should have read it more closely. Wpress 16:33, 30 January 2013 (CST)


Back to TrettelS Index