Deep learning tools predict variants in disordered regions with lower sensitivity.

Author information

Luppino Federica, Lenz Swantje, Chow Chi Fung Willis, Toth-Petroczy Agnes

Affiliations

Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307, Dresden, Germany.

Center for Systems Biology Dresden, Pfotenhauerstrasse 108, 01307, Dresden, Germany.

Publication information

BMC Genomics. 2025 Apr 12;26(1):367. doi: 10.1186/s12864-025-11534-9.

DOI:10.1186/s12864-025-11534-9

PMID:40221640

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11992697/

Abstract

BACKGROUND

The recent AI breakthrough of AlphaFold2 has revolutionized 3D protein structural modeling, proving crucial for protein design and variant effect prediction. However, intrinsically disordered regions, known for their lack of well-defined structure and lower sequence conservation, often yield low-confidence models. The latest Variant Effect Predictor (VEP), AlphaMissense, leverages AlphaFold2 models, achieving over 90% sensitivity and specificity in predicting variant effects. However, the effectiveness of such tools for variants in disordered regions, which account for 30% of the human proteome, remains unclear.

RESULTS

In this study, we found that predicting pathogenicity for variants in disordered regions is less accurate than in ordered regions, particularly for mutations at the first N-Methionine site. Evaluating variant effect predictors on intrinsically disordered regions (IDRs) showed that mutations in IDRs are predicted with lower sensitivity, and that the gap between sensitivity and specificity is largest in disordered regions, especially for AlphaMissense and VARITY.
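To make the reported metrics concrete, the following is a minimal sketch of how sensitivity, specificity, and the gap between them could be computed separately for ordered and disordered regions. It is not the authors' pipeline: the function name `sensitivity_specificity`, the record layout, and the toy variants are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def sensitivity_specificity(records):
    """Compute per-region sensitivity and specificity.

    `records` is an iterable of (region, is_pathogenic, predicted_pathogenic)
    tuples. This layout is an assumption made for this sketch.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for region, is_pathogenic, predicted_pathogenic in records:
        c = counts[region]
        if is_pathogenic:
            c["tp" if predicted_pathogenic else "fn"] += 1
        else:
            c["fp" if predicted_pathogenic else "tn"] += 1

    metrics = {}
    for region, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        # Sensitivity: fraction of pathogenic variants called pathogenic.
        sens = c["tp"] / positives if positives else float("nan")
        # Specificity: fraction of benign variants called benign.
        spec = c["tn"] / negatives if negatives else float("nan")
        metrics[region] = {
            "sensitivity": sens,
            "specificity": spec,
            "gap": spec - sens,
        }
    return metrics


# Hypothetical toy data, not taken from the study.
TOY_VARIANTS = [
    ("ordered", True, True), ("ordered", True, True),
    ("ordered", True, True), ("ordered", True, False),
    ("ordered", False, False), ("ordered", False, False),
    ("ordered", False, False), ("ordered", False, True),
    ("disordered", True, True), ("disordered", True, True),
    ("disordered", True, False), ("disordered", True, False),
    ("disordered", False, False), ("disordered", False, False),
    ("disordered", False, False), ("disordered", False, False),
]

if __name__ == "__main__":
    for region, m in sensitivity_specificity(TOY_VARIANTS).items():
        print(region, m)
```

In this toy set the disordered variants show lower sensitivity and a wider gap between specificity and sensitivity than the ordered ones, which mirrors the kind of pattern the abstract describes; the numbers themselves carry no meaning beyond illustration.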

CONCLUSIONS

The prevalence of IDRs within the human proteome, coupled with the growing repertoire of biological functions they are known to perform, necessitated an investigation into the efficacy of state-of-the-art VEPs on such regions. This analysis revealed consistently reduced sensitivity and a prediction performance profile that differs from that of ordered regions, indicating that new IDR-specific features and paradigms are needed to accurately classify disease mutations within those regions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c69c/11992697/cf113f75cfde/12864_2025_11534_Fig1_HTML.jpg
