Computational Structural Genomics Unit, Linda T. and John A. Mellowes Center for Genomics Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI 53226, USA.
Data Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA.
Int J Mol Sci. 2024 Nov 8;25(22):12018. doi: 10.3390/ijms252212018.
Clinical genomics sequencing is rapidly expanding the number of variants that need to be functionally elucidated. Interpreting genetic variants (i.e., mutations) usually begins by identifying how they affect protein-coding sequences. Still, the three-dimensional (3D) protein molecule is rarely considered for large-scale variant analysis, nor in analyses of how proteins interact with each other and their environment. We propose a standardized approach to scoring protein surface property changes as a new dimension for functionally and mechanistically interpreting genomic variants. Further, it directs hypothesis generation for functional genomics research to learn more about the encoded protein's function. We developed a novel method leveraging 3D structures and time-dependent simulations to score and statistically evaluate protein surface property changes. We evaluated positive controls composed of eight thermophilic versus mesophilic orthologs and variants that experimentally change the protein's solubility, which all showed large and statistically significant differences in charge distribution (p < 0.01). We scored static 3D structures and dynamic ensembles for 43 independent variants (23 pathogenic and 20 uninterpreted) across four proteins. Focusing on the potassium ion channel, KCNK9, the average local surface potential shifts were 0.41 kT/ec with an average p-value of 1 × 10. In contrast, dynamic ensemble shifts averaged 1.15 kT/ec with an average p-value of 1 × 10, enabling the identification of changes far from mutated sites. This study demonstrates that an objective assessment of how mutations affect electrostatic distributions of protein surfaces can aid in interpreting genomic variants discovered through clinical genomic sequencing.
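The idea of scoring a surface potential shift and attaching a p-value to it can be illustrated with a minimal sketch. The abstract does not say which statistical test the authors used, so the permutation test here is an assumption chosen for simplicity, not the published method. The function names (`potential_shift`, `permutation_p_value`) and the input values are hypothetical; the inputs stand in for the potential at one surface patch, in kT/e, sampled across the frames of a wild-type and a variant ensemble.

```python
import random
from statistics import mean


def potential_shift(wild_type, variant):
    """Mean surface potential shift (variant minus wild type), in kT/e."""
    return mean(variant) - mean(wild_type)


def permutation_p_value(wild_type, variant, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the difference in means.

    Assumption: frames are treated as exchangeable samples, which real
    simulation frames may not be because of time correlation.
    """
    rng = random.Random(seed)
    observed = abs(potential_shift(wild_type, variant))
    pooled = list(wild_type) + list(variant)
    n_wt = len(wild_type)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled = abs(mean(pooled[n_wt:]) - mean(pooled[:n_wt]))
        if shuffled >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up potentials for one surface patch across 50 ensemble frames each.
rng = random.Random(1)
wt_frames = [rng.gauss(0.0, 0.5) for _ in range(50)]
var_frames = [rng.gauss(1.2, 0.5) for _ in range(50)]

shift = potential_shift(wt_frames, var_frames)
p_value = permutation_p_value(wt_frames, var_frames)
print(f"shift = {shift:.2f} kT/e, p = {p_value:.4f}")
```

Repeating such a comparison for every surface patch, rather than only those near the mutated residue, is one way an analysis of this kind could surface changes far from the mutation site; a real workflow would also need to correct for multiple comparisons.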