MATHLA: a robust framework for HLA-peptide binding prediction integrating bidirectional LSTM and multiple head attention mechanism.

Affiliations

Shenzhen Neocura Biotechnology Co. Ltd., Shenzhen, 518055, China.

School of Computer Science and Technology, Heilongjiang University, Harbin, 150080, China.

Publication information

BMC Bioinformatics. 2021 Jan 6;22(1):7. doi: 10.1186/s12859-020-03946-z.

DOI:10.1186/s12859-020-03946-z

PMID:33407098

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7787246/

Abstract

BACKGROUND

Accurate prediction of binding between class I human leukocyte antigen (HLA) molecules and neoepitopes is critical for target identification in personalized T-cell-based immunotherapy. Many recent prediction tools built on deep learning algorithms and mass spectrometry data have indeed improved the average predictive power for class I HLA-peptide interactions. However, their performance varies greatly across individual HLA alleles and across peptides of different lengths; this is particularly true for HLA-C alleles, for which experimental data are limited. To meet the growing demand for the most accurate HLA-peptide binding prediction for individual patients in real-world clinical studies, a more advanced deep learning framework with higher prediction accuracy for HLA-C alleles and longer peptides is highly desirable.

RESULTS

We present MATHLA, a pan-allele HLA-peptide binding prediction framework that integrates a bidirectional long short-term memory (LSTM) network with a multi-head attention mechanism. The model achieves better prediction accuracy both in fivefold cross-validation and on an independent test dataset. It also outperforms existing tools on longer ligands of 11 to 15 amino acids, and it shows a significant improvement in HLA-C-peptide binding prediction. By investigating the multi-head attention weight scores, we depict possible interaction patterns between three HLA I supergroups and their cognate peptides.

CONCLUSION

Our method demonstrates that, in parallel with increasing the amount of high-quality HLA ligandome data, further development of deep learning algorithms is necessary for improving and interpreting HLA-peptide binding prediction.
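The attention weight scores mentioned in the results can be illustrated with a small sketch. The code below is not the MATHLA implementation: it skips the bidirectional LSTM entirely, feeds one-hot encoded residues straight into the attention step, and uses random matrices as hypothetical stand-ins for learned projections. The function names, head count, and head dimension are all assumptions made for illustration. It only shows how each head produces a position-by-position weight matrix whose rows sum to 1, which is the kind of quantity one would inspect for interaction patterns.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide):
    """Encode each residue as a 20-dimensional one-hot vector."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in peptide]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for a learned projection matrix."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def project(x, w):
    """Multiply a (length x d_in) sequence by a (d_in x d_out) matrix."""
    return [
        [sum(xi * w[i][j] for i, xi in enumerate(row)) for j in range(len(w[0]))]
        for row in x
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(peptide, num_heads=4, head_dim=8, seed=0):
    """Return one (length x length) attention weight matrix per head."""
    rng = random.Random(seed)
    x = one_hot(peptide)
    scale = math.sqrt(head_dim)
    heads = []
    for _ in range(num_heads):
        # Each head has its own (assumed, random) query and key projections.
        q = project(x, random_matrix(len(AMINO_ACIDS), head_dim, rng))
        k = project(x, random_matrix(len(AMINO_ACIDS), head_dim, rng))
        # Scaled dot-product scores, normalised per query position.
        head = [
            softmax([sum(a * b for a, b in zip(qi, kj)) / scale for kj in k])
            for qi in q
        ]
        heads.append(head)
    return heads


def position_importance(heads):
    """Average attention each position receives over all heads and queries."""
    length = len(heads[0])
    count = len(heads) * length
    return [
        sum(head[i][j] for head in heads for i in range(length)) / count
        for j in range(length)
    ]


if __name__ == "__main__":
    weights = attention_weights("SIINFEKL")  # arbitrary example peptide
    for pos, score in enumerate(position_importance(weights), start=1):
        print(f"position {pos}: {score:.3f}")
```

With trained projections in place of the random ones, the per-position averages would indicate which residues the model attends to most; with random weights, as here, the numbers carry no biological meaning.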

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c28c/7788970/fccfc6b0c6ab/12859_2020_3946_Fig1_HTML.jpg
