
A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea

Dongying Wu1,2, Philip Hugenholtz1, Konstantinos Mavromatis1, Rüdiger Pukall3, Eileen Dalin1, Natalia N. Ivanova1, Victor Kunin1, Lynne Goodwin4, Martin Wu5, Brian J. Tindall3, Sean D. Hooper1, Amrita Pati1, Athanasios Lykidis1, Stefan Spring3, Iain J. Anderson1, Patrik D'haeseleer1,6, Adam Zemla6, Mitchell Singer2, Alla Lapidus1, Matt Nolan1, Alex Copeland1, Cliff Han4, Feng Chen1, Jan-Fang Cheng1, Susan Lucas1, Cheryl Kerfeld1, Elke Lang3, Sabine Gronow3, Patrick Chain1,4, David Bruce4, Edward M. Rubin1, Nikos C. Kyrpides1, Hans-Peter Klenk3 & Jonathan A. Eisen1,2

Sequencing of bacterial and archaeal genomes has revolutionized our understanding of the many roles played by microorganisms [1]. There are now nearly 1,000 completed bacterial and archaeal genomes available [2], most of which were chosen for sequencing on the basis of their physiology. As a result, the perspective provided by the currently available genomes is limited by a highly biased phylogenetic distribution [3, 4, 5]. To explore the value added by choosing microbial genomes for sequencing on the basis of their evolutionary relationships, we have sequenced and analysed the genomes of 56 culturable species of Bacteria and Archaea selected to maximize phylogenetic coverage. Analysis of these genomes demonstrated pronounced benefits (compared to an equivalent set of genomes randomly selected from the existing database) in diverse areas including the reconstruction of phylogenetic history, the discovery of new protein families and biological properties, and the prediction of functions for known genes from other organisms. Our results strongly support the need for systematic phylogenomic efforts to compile a phylogeny-driven Genomic Encyclopedia of Bacteria and Archaea in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come.
http://phylogenomics.blogspot.com/2009/12/story-behind-nature-paper-on-phylo...