3.
Functional and evolutionary significance of unknown genes from uncultivated taxa.
Rodríguez Del Río Á, Giner-Lamia J, Cantalapiedra CP, Botas J, Deng Z, Hernández-Plaza A, Munar-Palmer M, Santamaría-Hernando S, Rodríguez-Herva JJ, Ruscheweyh HJ, Paoli L, Schmidt TSB, Sunagawa S, Bork P, López-Solanilla E, Coelho LP, Huerta-Cepas J.
Many of Earth's microbes remain uncultured and under-studied, limiting our understanding of the functional and evolutionary aspects of their genetic material, which remain largely overlooked in most metagenomic studies. Here, we analyzed 149,842 environmental genomes from multiple habitats and compiled a curated catalog of 404,085 functionally and evolutionarily significant novel (FESNov) gene families exclusive to uncultivated prokaryotic taxa. All FESNov families span multiple species, exhibit strong signals of purifying selection, and qualify as new orthologous groups, thus nearly tripling the number of bacterial and archaeal gene families described to date. The FESNov catalog is enriched in clade-specific traits, including 1,034 novel families that can distinguish entire uncultivated phyla, classes, and orders, likely representing synapomorphies that facilitated their evolutionary divergence. Using genomic context analysis and structural alignments, we predicted functional associations for 32.4% of FESNov families, including 4,349 high-confidence associations with important biological processes. These predictions provide a valuable hypothesis-driven framework, which we employed to experimentally validate a new gene family involved in cell motility and a novel set of antimicrobial peptides. We also demonstrate that the relative abundance profiles of novel families can discriminate between environments and clinical conditions, leading to the discovery of potentially new biomarkers associated with colorectal cancer. We expect this work to enhance future metagenomics studies and expand our knowledge of the genetic repertory of uncultivated organisms.