Izumi Hiroshi, Nafie Laurence A, Dukor Rina K
National Institute of Advanced Industrial Science and Technology (AIST), AIST Tsukuba West, 16-1 Onogawa, Tsukuba, Ibaraki 305-8569, Japan.
Department of Chemistry, Syracuse University, Syracuse, New York 13244-4100, United States.
ACS Omega. 2020 Nov 19;5(47):30556-30567. doi: 10.1021/acsomega.0c04472. eCollection 2020 Dec 1.
Amino acid mutations that improve protein stability and rigidity can be accompanied by increases in binding affinity. Conserved amino acids located on a protein surface may therefore be successfully targeted by antibodies. Quantitative deep mutational scanning is an excellent technique for understanding viral evolution, and the resulting data can be used to develop a vaccine. However, applying the approach to proteins in general is difficult because of its cost. To address this need, we report the construction of a deep neural network-based program for sequence-based prediction of supersecondary structure codes (SSSCs), called SSSCPrediction (SSSCPred). Further, to predict conformational flexibility or rigidity in proteins, we have also developed a comparison program called SSSCPreds, which consists of three deep neural network-based prediction systems (SSSCPred, SSSCPred100, and SSSCPred200). The calculations reported here with these algorithms show the degree of flexibility of the receptor-binding motif of the SARS-CoV-2 spike protein and the rigidity of the unique motif (SSSC: SSSHSSHHHH) at the S2 subunit, and the calculated values are independent of the X-ray and cryo-EM structures. The sequence flexibility/rigidity map of the SARS-CoV-2 RBD resembles the sequence-to-phenotype maps of ACE2-binding affinity and expression that were obtained experimentally by deep mutational scanning. This resemblance suggests that the SSSC sequences predicted identically by all three deep neural network-based systems correlate well with the sequences showing both lower ACE2-binding affinity and lower expression. Combined analysis of predicted and observed SSSCs with keyword-tagged datasets would be helpful in understanding the structural correlation to the examined system.
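The comparison idea described in the abstract can be illustrated with a minimal sketch. It assumes, as a simplified reading of the abstract, that positions where all three predictors return the same SSSC letter are treated as rigid and positions where they disagree are treated as flexible. The function names `agreement_map` and `rigid_fraction` and the three example strings are hypothetical; they are not the SSSCPreds program, its interface, or its actual output.

```python
from typing import List


def agreement_map(predictions: List[str]) -> str:
    """Return one letter per residue position.

    'R' (rigid) where every prediction has the same SSSC letter,
    'F' (flexible) where at least one prediction differs.
    This labeling is an illustrative assumption, not the published method.
    """
    if not predictions:
        raise ValueError("need at least one predicted SSSC string")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predicted SSSC strings must have the same length")
    # zip(*predictions) yields one tuple of letters per residue position.
    return "".join(
        "R" if len(set(column)) == 1 else "F"
        for column in zip(*predictions)
    )


def rigid_fraction(predictions: List[str]) -> float:
    """Fraction of positions on which all predictions agree."""
    mapping = agreement_map(predictions)
    return mapping.count("R") / len(mapping) if mapping else 0.0


# Hypothetical outputs standing in for SSSCPred, SSSCPred100, and SSSCPred200.
# The strings are made up for illustration only.
example = ["SSSHSSHHHH", "SSSHSSHHHH", "SSSHTSHHHH"]
print(agreement_map(example))   # RRRRFRRRRR
print(rigid_fraction(example))  # 0.9
```

In this toy example the three strings differ only at the fifth position, so that position alone is marked flexible. A real analysis would use the predicted SSSC strings for the full protein sequence.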