CS395T/CAM383M Computational Statistics  


#1
04-11-2010, 07:43 PM
Aayush Sharma
Member
Join Date: Jan 2010
Posts: 15
Aayush Sharma Term Project

Title: Lecture Slides on Generalized Linear Models and Logistic Regression

I introduce GLMs starting from the standard least-squares case, showing that the Gaussian noise model is a special case of a GLM. I then introduce the exponential family of distributions and show how GLMs arise when the noise term follows a distribution from this family. Finally, I cover the special case of logistic regression, focusing on how the parameters can be learned as a maximum likelihood solution. I also give a simple MATLAB implementation using the glmfit function.
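For reference, the connection between the Gaussian case, logistic regression, and the exponential family can be sketched as follows. The notation (natural parameter \eta, log-partition function A, mean \mu) is an assumption made here for illustration and is not taken from the slides; the Gaussian is written with unit variance for simplicity.

```latex
% One-parameter exponential family in natural (canonical) form
p(y \mid \eta) = h(y)\,\exp\bigl(\eta\,y - A(\eta)\bigr), \qquad \mathbb{E}[y] = A'(\eta)

% Gaussian noise with unit variance: \eta = \mu (identity link)
p(y \mid \mu) = \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}\,\exp\Bigl(\mu\,y - \tfrac{1}{2}\mu^2\Bigr), \qquad A(\eta) = \tfrac{1}{2}\eta^2

% Bernoulli: \eta = \log\frac{\mu}{1-\mu} (logit link)
p(y \mid \mu) = \exp\Bigl(y \log\frac{\mu}{1-\mu} + \log(1-\mu)\Bigr), \qquad A(\eta) = \log\bigl(1 + e^{\eta}\bigr)
```

Setting \eta = x^T \theta in the first line gives a GLM with canonical link: least squares in the Gaussian case and logistic regression in the Bernoulli case.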

For the purpose of this project, I will stick to canonical link functions, as they cover a sufficiently large class of models and allow tractable likelihood maximization via gradient ascent or Newton's method.
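As an illustration of why the canonical link keeps Newton's method simple, here is a minimal Python sketch for logistic regression with one feature and an intercept. It is not the project's code: the function names `sigmoid` and `newton_logistic` are hypothetical, and the two-parameter restriction is an assumption made so that the 2x2 Newton system can be solved by hand.

```python
import math


def sigmoid(z):
    """Logistic function, the inverse of the canonical (logit) link."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for very negative z
    return e / (1.0 + e)


def newton_logistic(xs, ys, n_iter=25, tol=1e-10):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by Newton's method.

    Assumes the classes overlap (so the MLE is finite) and that xs
    are not all equal (so the 2x2 system is non-singular).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # negative Hessian, X^T W X
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            r = y - p            # residual
            w = p * (1.0 - p)    # Bernoulli variance at the current fit
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 system (X^T W X) d = gradient by Cramer's rule.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5, 6, 7]
    ys = [0, 0, 1, 0, 1, 1, 0, 1]
    print(newton_logistic(xs, ys))
```

With the canonical link the gradient reduces to X^T (y - p) and the negative Hessian to X^T W X, which is what makes each Newton step a weighted least-squares solve.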

The writeup is 5 pages long, as I wanted to include all the relevant details.
The final deliverables will include:

1. Detailed lecture slides
2. Report on the main concepts
3. Data/Code used in the slides.

I would appreciate suggestions for improvements or extensions.

Last edited by Aayush Sharma; 05-04-2010 at 11:51 AM.
#2
04-13-2010, 12:21 PM
wpress
Professor
Join Date: Jan 2009
Posts: 222

Looks good. You might consider organizing things to first do a simple example of logistic regression, and then generalize to the GLM.
#3
04-19-2010, 09:26 AM
TheStig
Member
Join Date: Jan 2010
Posts: 27

Very thorough, so I don't have much to add; perhaps include a discussion or comparison of general and generalized linear models?
#4
05-04-2010, 11:48 AM
Aayush Sharma
Member
Join Date: Jan 2010
Posts: 15
Final Slides

Attached are the final slides on Generalized Linear Models and Logistic Regression. Also attached is learn_theta, a gradient ascent implementation for learning the parameters of a logistic regression model. The rest of the code snippets are included in the relevant slides. E.coli is the dataset from the UCI Machine Learning Repository used for evaluating logistic regression in the slides. The Fisher iris dataset comes pre-loaded with MATLAB.
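For readers without the attachment, the general idea of gradient ascent on the logistic log-likelihood can be sketched in Python as below. This is not the attached learn_theta file: the function names, the fixed step size, the iteration count, and the toy data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(theta, X, y):
    """Bernoulli log-likelihood: sum_i [ y_i * z_i - log(1 + exp(z_i)) ]."""
    ll = 0.0
    for row, yi in zip(X, y):
        z = sum(t * v for t, v in zip(theta, row))
        # log(1 + e^z) computed without overflow
        ll += yi * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return ll


def gradient_ascent(X, y, step=0.1, n_iter=2000):
    """Maximize the log-likelihood with a fixed step size.

    Each row of X is a feature vector; include a constant 1.0 entry
    for an intercept. The gradient is averaged over the samples.
    """
    n = len(X)
    theta = [0.0] * len(X[0])
    for _ in range(n_iter):
        grad = [0.0] * len(theta)
        for row, yi in zip(X, y):
            r = yi - sigmoid(sum(t * v for t, v in zip(theta, row)))
            for j, v in enumerate(row):
                grad[j] += r * v
        theta = [t + step * g / n for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    # Toy data (made up for this sketch): intercept column plus one centered feature.
    xs = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
    X = [[1.0, x] for x in xs]
    y = [0, 0, 1, 0, 1, 1, 0, 1]
    theta = gradient_ascent(X, y)
    print(theta, log_likelihood(theta, X, y))
```

A fixed step is the simplest choice but needs tuning to the scale of the features; centering or standardizing the inputs, as in the toy data, usually makes it far less sensitive.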

I have reorganized things to put logistic regression first, followed by the generalization to GLMs.
Attached Images
File Type: pdf final_slides.pdf (413.6 KB, 2831 views)
File Type: pdf Report.pdf (171.8 KB, 2435 views)
Attached Files
File Type: txt learn_theta.txt (1.2 KB, 586 views)
File Type: txt ecoli.txt (9.9 KB, 1084 views)

Last edited by Aayush Sharma; 05-04-2010 at 11:58 AM.
#5
05-07-2010, 11:51 AM
wpress
Professor
Join Date: Jan 2009
Posts: 222

Nice job!

Two minor comments:
Slide 6, last bullet: this is only nonlinear in a very mild way; a general nonlinear fit would be much harder.
Slide 8: Taking the sigmas to be constant is not generally a good model (depending on the application).
All times are GMT -6.