Predicting Gene Expression from Sequence: A Reexamination
Yuan Yuan, Lei Guo, Lei Shen, Jun S. Liu*
Department of Statistics, Harvard University, Cambridge, Massachusetts,
United States of America
Although much of the information regarding genes’ expressions is
encoded in the genome, deciphering such
information has been very challenging. We reexamined Beer and
Tavazoie’s (BT) approach to predict mRNA expression
patterns of 2,587 genes in Saccharomyces cerevisiae from the
information in their respective promoter sequences.
Instead of fitting complex Bayesian network models, we trained naïve
Bayes classifiers using only the sequence-motif
matching scores provided by BT. Our simple models correctly predict
expression patterns for 79% of the genes, based
on the same criterion and the same cross-validation (CV) procedure as
BT, which compares favorably to the 73%
accuracy of BT. The fact that our approach did not use position and
orientation information of the predicted binding
sites but achieved a higher prediction accuracy motivated us to
investigate a few biological predictions made by BT.
We found that some of their predictions, especially those related to
motif orientations and positions, are at best
circumstantial. For example, the combinatorial rules suggested by BT
for the PAC and RRPE motifs are not unique to
the cluster of genes from which the predictive model was inferred, and
there are simpler rules that are statistically
more significant than BT’s. We also show that the CV procedure used by
BT to estimate their method’s prediction
accuracy is inappropriate and may have overestimated the prediction
accuracy by about 10%.
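
To make the approach in the abstract concrete, here is a minimal sketch of a naïve Bayes classifier scored by k-fold cross-validation. It is not the authors' code. The Gaussian likelihood, the function names (fit_gaussian_nb, predict, cross_validate), the three synthetic clusters, and the random features standing in for BT's motif matching scores are all assumptions made for illustration. The classifier is refitted from the training folds alone in each round, so the held-out genes never influence the fitted parameters.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature row."""
    def log_posterior(params):
        log_prior, means, variances = params
        # Naive Bayes assumption: features are independent given the class.
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))


def cross_validate(X, y, k=5, seed=0):
    """k-fold CV accuracy; the model is refitted on each training split."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(X)


# Synthetic stand-in data (assumption): 3 "expression clusters",
# 60 "genes" each, 3 "motif scores" per gene.
rng = random.Random(1)
X, y = [], []
centres = [(0.0, 0.0, 0.0), (3.0, 0.0, 3.0), (0.0, 3.0, 3.0)]
for cluster, centre in enumerate(centres):
    for _ in range(60):
        X.append([rng.gauss(c, 1.0) for c in centre])
        y.append(cluster)

accuracy = cross_validate(X, y)
print(f"5-fold CV accuracy: {accuracy:.2f}")
```

With real data, X would hold one row of motif matching scores per gene and y the expression-cluster labels. If a preprocessing step such as clustering or motif selection were also data-driven, it would have to be repeated inside each fold for the accuracy estimate to remain unbiased.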
http://compbiol.plosjournals.org/archive/1553-7358/3/11/pdf/10.1371_journal.pcbi.0030243-L.pdf
--
==================================================================
Klaas Vandepoele, PhD
Tel. 32 (0)9 33 13822
VIB Department of Plant Systems Biology, Ghent University
Technologiepark 927, 9052 Gent, Belgium
E-mail: Klaas.Vandepoele@psb.ugent.be
Website: http://bioinformatics.psb.ugent.be/
==================================================================