Peptide-binding specificity prediction using fine-tuned protein structure prediction networks.

Affiliations

Department of Biochemistry, University of Washington, Seattle, WA 98195.

Institute for Protein Design, University of Washington, Seattle, WA 98195.

Publication information

Proc Natl Acad Sci U S A. 2023 Feb 28;120(9):e2216697120. doi: 10.1073/pnas.2216697120. Epub 2023 Feb 21.

DOI: 10.1073/pnas.2216697120

PMID: 36802421

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9992841/

Abstract

Peptide-binding proteins play key roles in biology, and predicting their binding specificity is a long-standing challenge. While considerable protein structural information is available, the most successful current methods use sequence information alone, in part because it has been a challenge to model the subtle structural changes accompanying sequence substitutions. Protein structure prediction networks such as AlphaFold model sequence-structure relationships very accurately, and we reasoned that if it were possible to specifically train such networks on binding data, more generalizable models could be created. We show that placing a classifier on top of the AlphaFold network and fine-tuning the combined network parameters for both classification and structure prediction accuracy leads to a model with strong generalizable performance on a wide range of Class I and Class II peptide-MHC interactions that approaches the overall performance of the state-of-the-art NetMHCpan sequence-based method. The peptide-MHC optimized model shows excellent performance in distinguishing binding and non-binding peptides to SH3 and PDZ domains. This ability to generalize well beyond the training set far exceeds that of sequence-only models and should be particularly powerful for systems where less experimental data are available.
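The abstract describes the training objective only at a high level: a classifier sits on top of the structure network, and the combined parameters are fine-tuned for both classification and structure prediction accuracy. One way such a joint objective could look is sketched below. Everything specific here is an assumption made for illustration and is not taken from the abstract: the scalar `interface_error` feature (a stand-in for some confidence measure from the structure network, lower meaning more confident), the logistic form of the classifier head, and the names and default values of `slope`, `midpoint`, and `cls_weight`.

```python
import math


def binder_probability(interface_error, slope, midpoint):
    """Hypothetical logistic head: lower predicted error -> higher P(binder)."""
    return 1.0 / (1.0 + math.exp(-slope * (midpoint - interface_error)))


def binary_cross_entropy(p, label, eps=1e-7):
    """Classification loss for one example; label is 1 (binder) or 0 (non-binder)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def combined_loss(structure_loss, interface_error, label,
                  slope=1.0, midpoint=10.0, cls_weight=1.0):
    """Joint objective: structure term plus a weighted classification term.

    structure_loss is assumed to be computed elsewhere by the structure
    network; slope, midpoint, and cls_weight are illustrative values only.
    """
    p = binder_probability(interface_error, slope, midpoint)
    return structure_loss + cls_weight * binary_cross_entropy(p, label)


if __name__ == "__main__":
    # A confidently modeled complex labeled as a binder adds little
    # classification loss; the same prediction labeled as a non-binder
    # is penalized heavily.
    print(combined_loss(2.0, interface_error=2.0, label=1))
    print(combined_loss(2.0, interface_error=2.0, label=0))
```

In a real fine-tuning run, both terms would be differentiated with respect to the network parameters, so that the classification signal from binding data can reshape the structural model while the structure term keeps predictions physically plausible.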

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64be/9992841/8891c46de420/pnas.2216697120fig01.jpg
