
Tom Michoel wrote:
- Jaak Vilo talked about a method to predict new genes participating in known pathways by overlaying them with expression data (an extension of their tool KEGGanim <http://biit.cs.ut.ee/kegganim/>). They used a completely trivial method (pairwise Pearson correlations with genes in the pathway). I have the feeling it would be very easy to improve on this by modeling the condition-dependent expression of genes in a pathway using our tools and then computing the probability that a new gene belongs to that pathway. This might be a good project for a new PhD student, e.g. focused on combining our Arabidopsis expression data with the pathways in Reactome. Leuven is also interested in this kind of thing.
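For reference, the baseline mentioned above (pairwise Pearson correlations with the genes in a pathway) could look roughly like the sketch below. This is only an illustration under assumptions: the function name rank_candidates, the toy profiles, and the choice to average the correlations into one score per candidate are all hypothetical, since the talk summary does not say how the pairwise values were combined. It also assumes no gene has a constant profile.

```python
import numpy as np

def rank_candidates(expr, genes, pathway):
    """Rank non-pathway genes by mean Pearson correlation with pathway genes.

    expr    : (n_genes, n_conditions) expression matrix
    genes   : list of gene names, one per row of expr
    pathway : set of names of genes known to be in the pathway
    Averaging the correlations is an assumption made for this sketch.
    """
    in_path = np.array([g in pathway for g in genes])
    # Centre and scale each profile to unit length, so that a dot product
    # between two rows equals their Pearson correlation.
    z = expr - expr.mean(axis=1, keepdims=True)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    corr = z[~in_path] @ z[in_path].T      # candidates x pathway genes
    score = corr.mean(axis=1)              # one score per candidate
    candidates = [g for g, p in zip(genes, in_path) if not p]
    order = np.argsort(-score)             # highest score first
    return [(candidates[i], float(score[i])) for i in order]

# Toy data (hypothetical): P1 and P2 form the pathway, C1 follows the same
# pattern, C2 follows an unrelated one.
t = np.linspace(0.0, 2.0 * np.pi, 20)
expr = np.vstack([
    np.sin(t),                          # P1
    np.sin(t) + 0.1 * np.cos(3 * t),    # P2
    2.0 * np.sin(t) + 1.0,              # C1
    np.cos(t),                          # C2
])
genes = ["P1", "P2", "C1", "C2"]
ranking = rank_candidates(expr, genes, {"P1", "P2"})
```

Here ranking lists C1 ahead of C2. A global score like this uses all conditions equally, which is exactly what a condition-dependent model would try to improve on.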
I would like to add that I am currently finalizing a study that, among other things, estimates the predictive power of (simple global) coexpression information (including some T-DNA screens to evaluate some of the predictions). Methods such as query-based biclustering (using known genes as queries) should indeed yield better results than simple global measures, but in the end proving your predictions experimentally might be the real challenge. This paper (amongst others), /Annotating Genes of Known and Unknown Function by Large-Scale Coexpression Analysis/, was published this week in Plant Phys and does exactly the same thing: it links unknown genes with known (e.g. pathway) genes using global expression similarity measures plus differentially expressed signatures.

Klaas