Barozi Victor, Chakraborty Shrestha, Govender Shaylyn, Morgan Emily, Ramahala Rabelani, Graham Stephen C, Bishop Nigel T, Tastan Bishop Özlem
Research Unit in Bioinformatics (RUBi), Department of Biochemistry, Microbiology and Bioinformatics, Rhodes University, Makhanda 6139, South Africa.
Division of Virology, Department of Pathology, University of Cambridge, Cambridge CB2 1QP, UK.
Comput Struct Biotechnol J. 2024 Oct 22;23:3800-3816. doi: 10.1016/j.csbj.2024.10.031. eCollection 2024 Dec.
Deciphering the effects of evolutionary mutations in viruses and predicting future mutations are crucial for designing long-lasting and effective drugs. While understanding the impact of current mutations on protein drug targets is feasible, predicting future mutations that arise from natural viral evolution and environmental pressures remains challenging. Here, we leveraged existing mutation data from the evolution of the SARS-CoV-2 protein drug target main protease (Mpro) to test the predictive power of dynamic residue network (DRN) analysis in identifying mutation cold and hot spots. We conducted molecular dynamics simulations on the Mpro of SARS-CoV-2 (Wuhan strain) and calculated eight DRN metrics, each of which identifies a unique network feature within the protein. The sets of residues with the highest and lowest values for each metric, comprising potential cold and hot spots, were compared to published biochemical analyses and to per-residue mutation frequencies observed across five SARS-CoV-2 lineages, encompassing a total of 191,878 sequences. Individual DRN metrics displayed only modest power to predict the mutation frequency of individual residues. However, integrating the eight DRN metrics with additional structural and sequence-derived metrics allowed us to develop machine learning models that significantly improved the prediction of residue mutation frequency. While further refinements should enhance accuracy, we demonstrated a robust method for understanding pathogen evolution. This approach can also guide the development of long-lasting drugs by targeting functional residues, located in and near the active site and allosteric sites, that are less prone to mutation.
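The idea behind a DRN metric, as described in the abstract, can be illustrated with a minimal sketch: build a residue contact graph for each simulation frame, compute a per-residue network metric, and average it over frames. Everything specific here is an assumption made for illustration and is not taken from the study: the 6.7 Å distance cutoff, the use of one representative atom per residue, the four-residue toy coordinates, and the function names. Only two generic centralities (degree and closeness) are shown; the sketch does not reproduce the authors' pipeline or their eight metrics.

```python
from collections import deque
from math import dist

CUTOFF = 6.7  # Å; assumed contact cutoff, chosen only for illustration


def contact_graph(coords, cutoff=CUTOFF):
    """Adjacency sets linking residues whose representative atoms lie within cutoff."""
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_centrality(adj):
    """Fraction of the other residues that each residue contacts."""
    n = len(adj)
    return [len(adj[i]) / (n - 1) for i in range(n)]


def closeness_centrality(adj):
    """Reachable residues divided by the summed shortest-path lengths to them."""
    values = []
    for source in range(len(adj)):
        hops = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search on the unweighted graph
            u = queue.popleft()
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        total = sum(hops.values())
        values.append((len(hops) - 1) / total if total else 0.0)
    return values


def averaged_metric(frames, metric, cutoff=CUTOFF):
    """Compute a metric on every frame's graph, then average per residue."""
    per_frame = [metric(contact_graph(frame, cutoff)) for frame in frames]
    return [sum(vals) / len(per_frame) for vals in zip(*per_frame)]


# Hypothetical two-frame "trajectory" of four residues (coordinates in Å).
frames = [
    [(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0)],  # linear chain 0-1-2-3
    [(0, 0, 0), (5, 0, 0), (10, 0, 0), (5, 5, 0)],   # residue 3 now contacts residue 1
]

avg_dc = averaged_metric(frames, degree_centrality)
avg_cc = averaged_metric(frames, closeness_centrality)

# Rank residues by averaged closeness, most central first.
ranking = sorted(range(len(avg_cc)), key=lambda i: avg_cc[i], reverse=True)
print(ranking, avg_dc, avg_cc)
```

In this toy case residue 1 has the highest averaged value for both metrics. Ranking residues by such averaged values is the kind of step that would precede selecting the highest- and lowest-scoring residue sets for comparison with observed mutation frequencies.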