Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
Nature. 2018 May;557(7706):503-509. doi: 10.1038/s41586-018-0124-0. Epub 2018 May 16.
One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.
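The two kinds of association described above (a gene tied to one condition, and a gene tied to another gene with a similar mutant phenotype) can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the thresholds (`strong`, `quiet`, `min_r`), the gene labels, and the fitness values are all hypothetical, and Pearson correlation is used here only as one plausible way to measure the similarity of fitness profiles.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def specific_conditions(conditions, profile, strong=2.0, quiet=1.0):
    """Conditions where |fitness| >= strong while every other condition
    stays below quiet in absolute value (thresholds are illustrative)."""
    hits = []
    for i, value in enumerate(profile):
        others = [abs(v) for j, v in enumerate(profile) if j != i]
        if abs(value) >= strong and all(o < quiet for o in others):
            hits.append(conditions[i])
    return hits

def cofit_partners(fitness, gene, min_r=0.8):
    """Other genes in the same organism whose fitness profile correlates
    with that of `gene` at or above min_r, strongest first."""
    partners = [
        (other, pearson(fitness[gene], profile))
        for other, profile in fitness.items()
        if other != gene
    ]
    return sorted(
        [(g, r) for g, r in partners if r >= min_r],
        key=lambda pair: pair[1],
        reverse=True,
    )

# Toy data, invented for illustration: one fitness value per condition.
conditions = ["glucose", "cisplatin", "low_pH", "acetate"]
fitness = {
    "geneA": [0.1, -3.2, 0.0, -0.2],
    "geneB": [0.0, -2.9, 0.2, -0.1],
    "geneC": [-0.1, 0.2, -2.5, 0.1],
}

print(specific_conditions(conditions, fitness["geneA"]))  # ['cisplatin']
print([g for g, _ in cofit_partners(fitness, "geneA")])   # ['geneB']
```

In this toy example, geneA is linked to a single condition and to geneB, whose profile tracks it closely, while geneC responds to a different condition and is not reported as a partner. The abstract's high-confidence step, checking that an association is conserved in other bacteria, would require ortholog mappings and is not modeled here.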