CS395T/CAM383M Computational Statistics  

CS395T/CAM383M Computational Statistics > Previous year: Spring, 2010 > Student Term Projects

  #1  
04-12-2010, 10:50 AM
TheStig
Member

Join Date: Jan 2010
Posts: 27
Project by Jonathan Young

Mid-course project report attached.


Tentative timetable for completion:

13 April - 16 April: Construct networks.
17 April - 23 April: Determine graph properties (part IV.A in attachment).
24 April - 1 May: Analyze network structure (parts IV.B and IV.C in attachment).
Attachment: midCourseProjRep.pdf (PDF, 74.3 KB)

Last edited by TheStig; 04-12-2010 at 10:19 PM. Reason: Minor corrections
  #2  
04-13-2010, 11:34 AM
wpress
Professor

Join Date: Jan 2009
Posts: 222

Looks good. I look forward to reading the final paper!
  #3  
04-13-2010, 01:26 PM
johnwoods
Member

Join Date: Jan 2010
Posts: 30

Couple quick thoughts.

You ask whether mouse disease genes are essential. This has actually been looked at a fair bit. Genes that are essential in other organisms tend to be essential in mouse as well. Disease genes range from non-essential to essential, depending on what kind of mutation gives rise to the disease. For example, some diseases result from null mutations (complete loss of function); since the organism survives without the gene product, those genes are not essential. Other diseases result from partial loss of function in genes whose full loss of function would be lethal; these genes are considered essential. Finally, some genes lead to lethality if modified more than very slightly, and are therefore essential.

Can you clarify what you mean by essential when you write this up later? It seems like what you mean is: "Can we determine essentiality of genes based on how central they are to the networks?" The answer to that is also a qualified yes, and it's been looked at -- I think by Kris McGary in the Marcotte Lab, but possibly by Insuk Lee (maybe both).
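
To make the "how central they are" question concrete, here is a minimal sketch that uses degree centrality as one possible notion of centrality and compares its mean between genes labeled essential and the rest. The choice of degree centrality, the gene names, and the essentiality labels are all assumptions for illustration; nothing here comes from the project report.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def mean(values):
    """Arithmetic mean, or NaN for an empty list."""
    return sum(values) / len(values) if values else float("nan")


def mean_centrality_by_label(centrality, essential):
    """Mean centrality of genes labeled essential vs. all other genes."""
    ess = [c for gene, c in centrality.items() if gene in essential]
    non = [c for gene, c in centrality.items() if gene not in essential]
    return mean(ess), mean(non)


# Hypothetical toy network: gene names and labels are made up.
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]
toy_essential = {"g1"}

centrality = degree_centrality(toy_edges)
ess_mean, non_mean = mean_centrality_by_label(centrality, toy_essential)
print(f"essential: {ess_mean:.4f}, non-essential: {non_mean:.4f}")
```

On real data one would want a proper statistical comparison of the two groups (and probably other centrality measures such as betweenness) rather than just two means.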

This question of "Which genes are involved in the most diseases?" is a pretty straightforward one. Can you go into more detail?
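
For reference, the straightforward version of this question amounts to counting distinct diseases per gene. A minimal sketch, assuming the data is available as (gene, disease) pairs; the pair format and all names below are hypothetical:

```python
from collections import defaultdict


def diseases_per_gene(associations):
    """Map each gene to the set of distinct diseases it is linked to."""
    by_gene = defaultdict(set)
    for gene, disease in associations:
        by_gene[gene].add(disease)
    return by_gene


def top_genes(associations, k=3):
    """Return the k genes with the most distinct diseases (ties broken by name)."""
    by_gene = diseases_per_gene(associations)
    ranked = sorted(by_gene.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(gene, len(diseases)) for gene, diseases in ranked[:k]]


# Hypothetical toy associations; the duplicate pair is counted only once.
toy_associations = [
    ("geneA", "d1"), ("geneA", "d2"), ("geneB", "d1"),
    ("geneA", "d2"), ("geneC", "d3"), ("geneB", "d4"),
]
ranking = top_genes(toy_associations, k=2)
print(ranking)
```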

One way to look at this problem is as a re-framing of the phenolog problem. I suggest discussing that in your final report. A lot of these methods are directly applicable to the current formulation of phenologs.
  #4  
04-17-2010, 08:20 PM
Aayush Sharma
Member

Join Date: Jan 2010
Posts: 15

This seems similar to link prediction in graphical models. If so, posing the problem as an MRF (even a sparse Boltzmann machine) might help. One can then use variational EM, for example with a mean-field approximation in the E-step, to learn the link weights.
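
As a rough illustration of the mean-field piece only, here is a sketch of naive mean-field fixed-point updates for a binary pairwise MRF with states in {0, 1} (a Boltzmann machine). It assumes the energy E(x) = -sum_i b_i x_i - sum_{i<j} W_ij x_i x_j, so the update is m_i = sigmoid(b_i + sum_j W_ij m_j). The M-step that would actually learn the link weights is not shown, and the weights and biases below are made-up toy values.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mean_field(weights, biases, clamped=None, sweeps=50, tol=1e-9):
    """Approximate marginals P(x_i = 1) by naive mean-field updates.

    weights: symmetric n x n nested list with a zero diagonal.
    biases:  list of length n.
    clamped: optional {index: 0 or 1} for observed units, held fixed.
    """
    n = len(biases)
    clamped = clamped or {}
    m = [float(clamped.get(i, 0.5)) for i in range(n)]
    for _ in range(sweeps):
        max_change = 0.0
        for i in range(n):
            if i in clamped:
                continue
            field = biases[i] + sum(weights[i][j] * m[j] for j in range(n) if j != i)
            updated = sigmoid(field)
            max_change = max(max_change, abs(updated - m[i]))
            m[i] = updated
        if max_change < tol:  # stop once a full sweep changes nothing
            break
    return m


# Hypothetical toy model: only units 0 and 1 are linked.
toy_weights = [
    [0.0, 2.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
toy_biases = [0.0, 0.0, 0.0]

# Observe unit 0 in state 1 and infer the other two.
marginals = mean_field(toy_weights, toy_biases, clamped={0: 1})
print(marginals)
```

In a full variational EM loop, marginals like these would stand in for the exact posterior expectations when updating the weights.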
  #5  
04-19-2010, 08:55 AM
TheStig
Member

Join Date: Jan 2010
Posts: 27

Quote:
Originally Posted by johnwoods
Couple quick thoughts.

Can you clarify what you mean by essential when you write this up later? It seems like what you mean is: "Can we determine essentiality of genes based on how central they are to the networks?"

Yes, that is what I meant.


Quote:
Originally Posted by johnwoods
The answer to that is also a qualified yes, and it's been looked at -- I think by Kris McGary in the Marcotte Lab, but possibly by Insuk Lee (maybe both).

OK, I'll check that out.

Quote:
Originally Posted by johnwoods
This question of "Which genes are involved in the most diseases?" is a pretty straightforward one. Can you go into more detail?

Well, I wasn't really planning to go into any more detail than the question implied. I may simply drop this from the project if the question is not that interesting, and instead focus on comparing my approach to past work on evolutionary conservation in human and mouse disease, which seemed to focus mainly on analyses of DNA sequence.

Quote:
Originally Posted by johnwoods
One way to look at this problem is as a re-framing of the phenolog problem. I suggest discussing that in your final report. A lot of these methods are directly applicable to the current formulation of phenologs.

That's a pretty good idea - I'll be sure to include some discussion of phenologs.
  #6  
04-19-2010, 08:56 AM
TheStig
Member

Join Date: Jan 2010
Posts: 27

Quote:
Originally Posted by Aayush Sharma
This seems similar to link prediction in graphical models. If so, posing the problem as an MRF (even a sparse Boltzmann machine) might help. One can then use variational EM, for example with a mean-field approximation in the E-step, to learn the link weights.

Okay, thanks. I'll look into the Markov random field approach.
  #7  
05-05-2010, 03:00 PM
TheStig
Member

Join Date: Jan 2010
Posts: 27
Final Project Submission

Project report attached.
Attachment: projectReport.pdf (PDF, 197.3 KB)
  #8  
05-07-2010, 01:16 PM
wpress
Professor

Join Date: Jan 2009
Posts: 222

Interesting data set. But I'm not sure I understand what properties of the genes you were trying to capture by clustering on the two parameters shown in Figure 4, which to me just looks like a smooth, somewhat correlated distribution. Clearly the red-blue dividing line in Figure 4 and the segment dividing lines in Figure 5 have no particular biological significance.