Zafar Aziz, Hou Chao, Amirani Naufa, Shen Yufeng
Department of Biomedical Informatics, Columbia University Irving Medical Center.
Department of Systems Biology, Columbia University Irving Medical Center.
bioRxiv. 2025 May 12:2025.05.07.652723. doi: 10.1101/2025.05.07.652723.
Predictive models for missense variant pathogenicity offer little functional interpretation for intrinsically disordered regions (IDRs), since they rely on conservation and coevolution across homologous sequences. To understand the extent to which biophysics, compared to genomic conservation, modulates model performance, we model the biophysics of IDRs explicitly for improved interpretation of variant effects. We develop MDmis, a method that uses biophysical features extracted from molecular dynamics (MD) simulations of IDRs to predict pathogenicity. We find that pathogenic variants in Long IDRs manifest differently from those in Short IDRs, with transient order and depleted solvent access. Using MD simulations of sequences with single missense variants, we identify stronger evidence for pathogenic effects in Long IDRs than in Short IDRs. MDmis, when combined with conservation information, achieves strong accuracy in predicting the pathogenicity of variants in Long IDRs. Overall, extracting information from MD simulations can help explain the drivers of predictive performance and elucidate the biophysical behaviors affected by pathogenic variants.