CS395T/CAM383M Computational Statistics  

Go Back   CS395T/CAM383M Computational Statistics > CS395T/CAM383M (Spring 2011) Lectures and Student Participation > Homework Assignments and Student Postings > HW 6

Reply
 
Thread Tools Display Modes
  #1  
Old 04-14-2011, 06:36 PM
ilevy ilevy is offline
Member
 
Join Date: Jan 2011
Posts: 43
Default Problem 3

Based on courtney's post. I tried to do it independly but looked at hers for help. End result is basically the same code though. Planning to do some analysis of convergence based on multiple runs of this tonight.

Probably will add in logic here to initiate a guess from random segmentation of the trails. Then can loop this function with either a constant prob matrix or one drawn from a distribution - probably uniform random on all fields.

Code:
% typical input em1(5,10,[.5; .9], [.1; .4])
function [delta guess] = em1(numTrials,numFlips,probs,guess)

numCoins = size(probs,1);

trueCoins = randsample(numCoins,numTrials,true);
flipCounts = binornd(numFlips, probs(trueCoins));
estMat = [flipCounts'; numFlips - flipCounts']

iter = 1;
delta(iter) = 1;
prMat = zeros(numTrials,numCoins);
while delta(iter) > 1e-3
  for i=1:numCoins
    prMat(:,i) = binopdf(flipCounts,numFlips,guess(i,iter));
  end

  rSum = repmat(sum(prMat,2),1,numCoins);
  prMat = prMat ./ rSum;

  counts = estMat * prMat;
  
  iter = iter + 1;
  guess(:,iter) = (counts(1,:) ./ sum(counts))';
  delta(:,iter) = max(abs(guess(:,iter)-guess(:,iter-1)));

  if (iter > 50) % sanity check
    break
  end
end
Did you do multiple trials for each set of fixed settings? I'm thinking of drawing initial probabilities from a distribution and maybe doing some stats on convergence rates. But I've spent enough time just doing em1 for now.
Reply With Quote
  #2  
Old 04-15-2011, 06:55 AM
smirarab smirarab is offline
Member
 
Join Date: Jan 2011
Posts: 34
Default

Quote:
Originally Posted by ilevy View Post
Did you do multiple trials for each set of fixed settings? I'm thinking of drawing initial probabilities from a distribution and maybe doing some stats on convergence rates. But I've spent enough time just doing em1 for now.
I haven't done an experiment either, but I did informally run several cases and I observed two things:
1- algorithm is sensitive to whether you get the relation between the two outcomes right in your initial guess. By that I mean, if the true probability of coin 1 is higher than coin 2, and your initial guess assigns higher probability to coin 2, the algorithm will find the reverse results. In essence it doesn't have a way to know which coin is which.
2- other than that, in my experiment, the initial guess does not matter too much for the case when we optimize weighted assignments (figure 1.b).
Reply With Quote
  #3  
Old 04-15-2011, 04:26 PM
ilevy ilevy is offline
Member
 
Join Date: Jan 2011
Posts: 43
Default

OK so I ran some simulations. Feel free to use it...

Code:
function [delta guess] = emCoin(numTrials,numFlips,probs, alg)

numCoins = size(probs,1);

trueCoins = randsample(numCoins,numTrials,true);
flipCounts = binornd(numFlips, probs(trueCoins));

guess = rand(numCoins,1);
delta = 1;

iter = 1;
prMat = zeros(numTrials,numCoins);
while delta(iter) > 1e-3
  prMat = binopdf(repmat(flipCounts',numCoins,1),numFlips,...
                  repmat(guess(:,iter),1,numTrials));

  if (alg == 1)
    cSum = repmat(sum(prMat),numCoins,1);
    estMat = prMat ./ cSum;
  else
    [M, I] = max(prMat);
    estMat = sparse(I,1:numTrials,1,numCoins,numTrials);
  end
    
  iter = iter + 1;
  guess(:,iter) = estMat * flipCounts ./ (sum(estMat,2) * numFlips);
  delta(:,iter) = max(abs(guess(:,iter)-guess(:,iter-1)));

  if (iter > 100)
    break
  end
end
Code:
function [A D] = simCoin(numSims,t,f,prob,alg)

sProb = sort(prob);

A = zeros(numSims,1);
D = zeros(numSims,1);
for i=1:numSims
  [curD curG] = emCoin(t,f,sProb,alg);
  A(i) = norm(sort(curG(:,end))-sProb);
  D(i) = size(curD,2) - 1;
end
The simplistic algorithm is kind of a joke. It doesn't find things very well. My statistic is normed distance from actual probabilities. Here's a hist of the algorithm for 10 trials, 5 flips each, with true probability [.7; .3]


Obviously with more data the probablilities estimates get better...

Last edited by ilevy; 04-15-2011 at 04:33 PM.
Reply With Quote
  #4  
Old 04-18-2011, 11:25 AM
smirarab smirarab is offline
Member
 
Join Date: Jan 2011
Posts: 34
Default

Quote:
Originally Posted by ilevy View Post
The simplistic algorithm is kind of a joke. It doesn't find things very well.
hmmm, in my experience it was working pretty ok. I think the reason it doesn't work well for you is that your initial guess is random. I don't know if a random initial guess is realistic.
Reply With Quote
  #5  
Old 04-18-2011, 11:53 AM
ilevy ilevy is offline
Member
 
Join Date: Jan 2011
Posts: 43
Default

I tried a couple ways of doing the initial guess. I started by partitioning the trials and assigning some subset to each coin estimate. That worked fine, but random seemed simpler and I figured it should converge as well. Perhaps not in the first case.

I didn't calculate final likelihood estimates, but rather distance from the true values. The two should be fairly related, but maybe the 'max' alg achieves ok likelihood with less accurate estimates. Not sure exactly why that would be, but it's possible since there aren't guarantees about it's convergence.
Reply With Quote
  #6  
Old 04-19-2011, 01:26 PM
Courtney Courtney is offline
Member
 
Join Date: Jan 2011
Posts: 14
Default Initial Guess

So this is a case in which the method may be sensitive to the initial guess... I wouldn't advise a purely random one.

Why not have the guess as the true value plus a random error ranging from 0 to 0.2 either way. Remember that we have an observer basing his guess on the data at hand.
If I saw HHHTHHHHT, I would think that the coin has a bias greater than .5, but that's about it!

It's not such a big deal with two coins... but when you add more, the second algorithm requires a semi-decent guess.
Reply With Quote
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -6. The time now is 09:34 AM.


Powered by vBulletin® Version 3.8.6
Copyright ©2000 - 2018, Jelsoft Enterprises Ltd.