Lubin Joseph H, Zardecki Christine, Dolan Elliott M, Lu Changpeng, Shen Zhuofan, Dutta Shuchismita, Westbrook John D, Hudson Brian P, Goodsell David S, Williams Jonathan K, Voigt Maria, Sarma Vidur, Xie Lingjun, Venkatachalam Thejasvi, Arnold Steven, Alvarado Luz Helena Alfaro, Catalfano Kevin, Khan Aaliyah, McCarthy Erika, Staggers Sophia, Tinsley Brea, Trudeau Alan, Singh Jitendra, Whitmore Lindsey, Zheng Helen, Benedek Matthew, Currier Jenna, Dresel Mark, Duvvuru Ashish, Dyszel Britney, Fingar Emily, Hennen Elizabeth M, Kirsch Michael, Khan Ali A, Labrie-Cleary Charlotte, Laporte Stephanie, Lenkeit Evan, Martin Kailey, Orellana Marilyn, de la Campa Melanie Ortiz-Alvarez, Paredes Isaac, Wheeler Baleigh, Rupert Allison, Sam Andrew, See Katherine, Zapata Santiago Soto, Craig Paul A, Hall Bonnie L, Jiang Jennifer, Koeppe Julia R, Mills Stephen A, Pikaart Michael J, Roberts Rebecca, Bromberg Yana, Hoyer J Steen, Duffy Siobain, Tischfield Jay, Ruiz Francesc X, Arnold Eddy, Baum Jean, Sandberg Jesse, Brannigan Grace, Khare Sagar D, Burley Stephen K
bioRxiv. 2020 Dec 7:2020.12.01.406637. doi: 10.1101/2020.12.01.406637.
Three-dimensional structures of SARS-CoV-2 and other coronaviral proteins archived in the Protein Data Bank were used to analyze viral proteome evolution during the first six months of the COVID-19 pandemic. Analyses of spatial locations, chemical properties, and structural and energetic impacts of the observed amino acid changes in >48,000 viral proteome sequences showed how each one of the 29 viral study proteins has undergone amino acid changes. Structural models computed for every unique sequence variant revealed that most substitutions map to protein surfaces and boundary layers, with a minority affecting hydrophobic cores. Conservative changes were observed more frequently in cores versus boundary layers/surfaces. Active sites and protein-protein interfaces showed modest numbers of substitutions. Energetics calculations showed that the impact of substitutions on the thermodynamic stability of the proteome follows a universal bi-Gaussian distribution. Detailed results are presented for six drug discovery targets and four structural proteins comprising the virion, highlighting substitutions with the potential to impact protein structure, enzyme activity, and functional interfaces. Characterizing the evolution of the virus in three dimensions provides testable insights into viral protein function and should aid in structure-based drug discovery efforts as well as the prospective identification of amino acid substitutions with potential for drug resistance.