1.
Differential genome analysis applied to the species-specific features of Helicobacter pylori.
We introduce a simple and rapid strategy to identify genes that are responsible for species-specific phenotypes. The genome of a species that has a specific phenotype is compared with the genome of at least one closely related species that lacks this phenotype. Homologous genes that are shared among the compared species are identified and discarded from the list of candidates for species-specific genes. The process is automated and rapidly yields a small subset of the genome that likely contains the genes responsible for the species-specific features. Functions are assigned to the genes, and dubious annotations are filtered out. Information is extracted not only from the presence of genes but also from their absence, interpreted with respect to known phenotypes. We have applied the technique to identify a set of species-specific genes in Helicobacter pylori by comparing it with its closest relatives for which complete genome sequences are available, Haemophilus influenzae and Escherichia coli. Of the genes in this set for which functional features can be obtained, a large fraction (63%, 123 proteins) is potentially involved in H. pylori's interaction with its host. We hypothesize that a family of outer membrane proteins is critical for the ability of H. pylori to colonize host cells in highly acidic environments.
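The filtering step described above can be sketched as a set subtraction: keep every gene of the target genome that has no homolog in any reference genome. The sketch below is illustrative only. The function names, the toy sequences, and the k-mer overlap test are assumptions made for this example; the abstract does not say how homology was detected, and a real analysis would use a proper sequence similarity search in place of `toy_is_homolog`.

```python
from typing import Callable, Dict, List, Set


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def toy_is_homolog(a: str, b: str, threshold: float = 0.5) -> bool:
    """Hypothetical stand-in for a homology test.

    Two sequences count as homologous when the fraction of shared k-mers,
    relative to the smaller k-mer set, reaches the threshold. This is an
    assumption for illustration, not the method used in the paper.
    """
    ka, kb = kmers(a), kmers(b)
    if not ka or not kb:
        return False
    return len(ka & kb) / min(len(ka), len(kb)) >= threshold


def species_specific(
    target: Dict[str, str],
    references: List[Dict[str, str]],
    is_homolog: Callable[[str, str], bool] = toy_is_homolog,
) -> List[str]:
    """Return IDs of target genes with no homolog in any reference genome."""
    candidates = []
    for gene_id, seq in target.items():
        shared = any(
            is_homolog(seq, ref_seq)
            for ref in references
            for ref_seq in ref.values()
        )
        if not shared:
            # No reference genome carries a homolog: keep as a candidate.
            candidates.append(gene_id)
    return candidates


if __name__ == "__main__":
    # Toy protein sequences, invented for this example.
    target = {"geneA": "MKTLLVAGGA", "geneB": "MQQRSTWYPH"}
    reference_1 = {"r1": "MKTLLVAGGS"}
    reference_2 = {"r2": "AAAAAAAAAA"}
    print(species_specific(target, [reference_1, reference_2]))
```

Passing the homology test as a parameter keeps the subtraction logic separate from the choice of similarity measure, so a stricter or more sensitive test can be substituted without changing the rest of the sketch.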