Fine-tuning protein language models to understand the functional impact of missense variants.

Authors

Saadat Ali, Fellay Jacques

Affiliations

School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.

Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Publication information

Comput Struct Biotechnol J. 2025 May 28;27:2199-2207. doi: 10.1016/j.csbj.2025.05.022. eCollection 2025.

Abstract

Elucidating the functional effects of missense variants is crucial yet challenging. To investigate their impact, we fine-tuned protein language models, including ESM2 and ProtT5, to classify 20 protein features at amino acid resolution. In addition, we trained a fully connected neural network classifier on frozen embeddings and compared its performance to fine-tuning in order to quantify the added value of task-specific adaptation. We then used the fine-tuned models to: 1) identify protein features enriched in either pathogenic or benign missense variants, and 2) compare the predicted feature profiles of proteins with reference and alternate alleles to understand how missense variants affect protein functionality. We show that our models can be used to reclassify variants of uncertain significance and provide mechanistic insights into the functional consequences of missense mutations.
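
As a rough illustration of the second use described in the abstract (comparing predicted feature profiles for the reference and alternate alleles), the minimal Python sketch below applies a missense substitution to a sequence and subtracts per-residue feature scores. It is not the authors' code: `apply_missense`, `profile_delta`, `toy_predict`, and the feature name `disulfide_toy` are all hypothetical names introduced here, and `toy_predict` is a trivial stand-in for a fine-tuned protein language model.

```python
from typing import Callable, Dict, List

# Assumed representation: feature name -> one score per residue.
Profile = Dict[str, List[float]]


def apply_missense(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the residue at 1-based position pos changed from ref to alt."""
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[: pos - 1] + alt + seq[pos:]


def profile_delta(
    predict: Callable[[str], Profile], seq: str, pos: int, ref: str, alt: str
) -> Profile:
    """Per-residue change in each feature score (alternate minus reference)."""
    ref_profile = predict(seq)
    alt_profile = predict(apply_missense(seq, pos, ref, alt))
    return {
        feature: [a - r for r, a in zip(ref_profile[feature], alt_profile[feature])]
        for feature in ref_profile
    }


def toy_predict(seq: str) -> Profile:
    """Placeholder predictor (NOT a real model): scores cysteines 1.0 for a made-up feature."""
    return {"disulfide_toy": [1.0 if aa == "C" else 0.0 for aa in seq]}


# Hypothetical variant: cysteine to serine at position 3 of a made-up peptide.
delta = profile_delta(toy_predict, "MKCA", 3, "C", "S")
print(delta)
```

In practice `predict` would wrap a model that returns per-residue probabilities for each of the 20 features; a missense change can shift scores at neighboring positions as well, which is why the sketch returns a delta for every residue rather than only the mutated one.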

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e405/12166733/4ba41b147d80/gr001.jpg
