Brunello Franco Gino, Erra Lorenzo, Nicola Juan, Martí Marcelo Adrián
Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (FCEyN-UBA) e Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN) CONICET, Pabellón 2 de Ciudad Universitaria, Ciudad de Buenos Aires, Argentina.
Departamento de Bioquímica Clínica, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba. Centro de Investigaciones en Bioquímica Clínica e Inmunología - Consejo Nacional de Investigaciones Científicas y Técnicas (CIBICI-CONICET), Córdoba, Argentina.
PLoS Comput Biol. 2025 Aug 4;21(8):e1012829. doi: 10.1371/journal.pcbi.1012829. eCollection 2025 Aug.
Short Linear Motifs (SLiMs) are functionally relevant protein regions that mediate reversible protein-protein interactions. Variants that disrupt SLiMs can cause numerous Mendelian diseases. Although various bioinformatic tools have been developed to identify SLiMs, most suffer from low specificity. In our previous work, we demonstrated that integrating sequence variant information with structural analysis improves the prediction of true functional SLiMs and, at the same time, yields tolerance matrices that indicate, for each motif position, whether each of the 19 possible single amino acid substitutions (SASs) is tolerated. However, the scarcity of representative crystallographic structures of SLiM-receptor complexes was a significant limitation. In this study, we demonstrate that these interactions can be modeled with AlphaFold2 (AF2) to generate high-quality structures that serve as input for our MotSASi method. These AF2-derived structures perform robustly, both in reproducing known structures deposited in the Protein Data Bank (PDB) and in reflecting the deleterious effects of known sequence variants. This updated version of MotSASi expands the repertoire of high-confidence predicted SLiMs and provides a comprehensive catalog of variants located within SLiMs, along with their respective deleteriousness assessments. Compared with AlphaMissense, MotSASi performs better at predicting variant deleteriousness. By contributing to the accurate identification and interpretation of variants, this work aligns with ACMG/AMP standards and aims to improve diagnostic rates in clinical genomics.
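To make the notion of a tolerance matrix concrete, the following is a minimal sketch of one possible representation: one row per motif position, with a tolerated/not-tolerated flag for each of the 19 alternative residues. It is not the MotSASi implementation; the function names, the toy motif, and the set of tolerated substitutions are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: NOT the MotSASi code. All names and data are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def build_tolerance_matrix(motif, tolerated):
    """Return one dict per motif position mapping each of the 19
    alternative residues to True (tolerated) or False (not tolerated).

    motif     -- wild-type motif sequence, e.g. "PPLP" (toy example)
    tolerated -- set of (position, alt_residue) pairs, 0-based positions
    """
    matrix = []
    for pos, wild_type in enumerate(motif):
        row = {
            alt: (pos, alt) in tolerated
            for alt in AMINO_ACIDS
            if alt != wild_type  # a substitution must change the residue
        }
        matrix.append(row)
    return matrix


def is_tolerated(matrix, motif, pos, alt):
    """Look up whether substituting motif[pos] with alt is tolerated."""
    if alt == motif[pos]:
        raise ValueError("alt equals the wild-type residue; not a substitution")
    return matrix[pos][alt]


# Toy data (assumed, not taken from the paper).
toy_motif = "PPLP"
toy_tolerated = {(1, "A"), (2, "V")}
toy_matrix = build_tolerance_matrix(toy_motif, toy_tolerated)
```

With this toy input, `is_tolerated(toy_matrix, toy_motif, 1, "A")` looks up a substitution that was marked tolerated, while any pair absent from `toy_tolerated` is reported as not tolerated. How the tolerated set is actually derived (from variant data and structural analysis) is outside the scope of this sketch.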