Zhang Chengxin, Zheng Wei, Cheng Micah, Omenn Gilbert S, Freddolino Peter L, Zhang Yang
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States.
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States.
J Proteome Res. 2021 Feb 5;20(2):1178-1189. doi: 10.1021/acs.jproteome.0c00359. Epub 2021 Jan 4.
When the JCVI-syn3.0 genome was designed and implemented in 2016 as the minimal genome of a free-living organism, approximately one-third of its 438 protein-coding genes had no known function. Subsequent refinement into JCVI-syn3A added 16 protein-coding genes, several of them of unknown function, and resulted in an improved growth phenotype. Here, we seek to unveil the biological roles and protein-protein interaction (PPI) networks of these poorly characterized proteins using state-of-the-art deep learning contact-assisted structure prediction, followed by structure-based function annotation and PPI prediction. Our pipeline confidently assigns functions to many previously unannotated proteins, such as putative vitamin transporters, which suggests that nutrient uptake remains important even in a minimized genome. Remarkably, despite the artificial selection of genes in the minimal syn3 genome, our reconstructed PPI network still shows a power-law distribution of node degrees typical of naturally evolved bacterial PPI networks. Using our framework for combined structure/function/interaction modeling, we identify both fundamental aspects of network biology that are retained in a minimal proteome and additional essential functions not yet recognized among the poorly annotated components of the syn3.0 and syn3A proteomes.
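The power-law claim about node degrees can be made concrete with a small sketch. The Python below is an illustration only, not the authors' pipeline: it counts node degrees from an undirected edge list and estimates the exponent of P(k) ~ k^(-alpha) with a commonly used approximate maximum-likelihood formula for discrete data. The function names, the toy edge list, and the choice of `k_min` are assumptions made for this example, and the approximation is known to be rough when `k_min` is small.

```python
import math
from collections import Counter


def node_degrees(edges):
    """Return {node: degree} for an undirected edge list.

    Self-interactions and repeated edges (in either orientation) are ignored.
    """
    seen = set()
    degree = Counter()
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions
        key = frozenset((a, b))
        if key in seen:
            continue  # skip duplicates such as (A, B) and (B, A)
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def degree_histogram(degrees):
    """Map each degree k to the number of nodes that have that degree."""
    return dict(sorted(Counter(degrees.values()).items()))


def powerlaw_exponent(degrees, k_min=1):
    """Approximate MLE of alpha in P(k) ~ k^(-alpha) for degrees >= k_min.

    Uses alpha ~= 1 + n / sum(ln(k_i / (k_min - 0.5))), an approximation
    for discrete data; it is an estimate, not a goodness-of-fit test.
    """
    if k_min < 1:
        raise ValueError("k_min must be at least 1")
    ks = [k for k in degrees.values() if k >= k_min]
    if not ks:
        raise ValueError("no nodes with degree >= k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in ks)
    return 1.0 + len(ks) / log_sum


if __name__ == "__main__":
    # Hypothetical toy network; the protein labels are placeholders.
    toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
                 ("P2", "P3"), ("P4", "P5")]
    degs = node_degrees(toy_edges)
    print(degree_histogram(degs))
    print(round(powerlaw_exponent(degs), 3))
```

An exponent estimate alone does not show that a network is scale-free; on a real PPI network one would typically also compare the fit against alternatives such as an exponential or log-normal degree distribution.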