HLAB: learning the BiLSTM features from the ProtBert-encoded proteins for the class I HLA-peptide binding prediction.

Affiliations

School of Biology & Engineering, Guizhou Medical University, Guiyang, Guizhou 550004, P.R. China.

College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, P.R. China.

Publication information

Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac173.

DOI:10.1093/bib/bbac173
PMID:35514183
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9487590/
Abstract

Human Leukocyte Antigen (HLA) is a type of molecule that resides on the surfaces of most human cells and plays an essential role in the immune response to invading agents. T cell antigen receptors may recognize the HLA-peptide complexes on the surfaces of cancer cells, and these cancer cells are then destroyed by cytotoxic T lymphocytes. The computational determination of HLA-binding peptides will facilitate the rapid development of cancer immunotherapies. This study hypothesized that natural language processing-encoded peptide features may be further enriched by another deep neural network. The hypothesis was tested by extracting Bi-directional Long Short-Term Memory (BiLSTM) features from the encodings that the pretrained Protein Bidirectional Encoder Representations from Transformers (ProtBert) model produces for class I HLA (HLA-I)-binding peptides. The experimental data showed that the proposed HLAB feature engineering algorithm outperformed the existing ones in detecting HLA-I-binding peptides. The extensive evaluation data show that HLAB outperforms all seven existing studies in AUC when predicting the peptides binding to the HLA-A*01:01 allele, and achieves the best average AUC values on six of the seven k-mer prediction tasks (k = 8, 9, ..., 14, where each task covers peptides consisting of k amino acids), the exception being the 9-mer task. The source code and the fine-tuned feature extraction models are available at http://www.healthinformaticslab.org/supp/resources.php.
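
The evaluation described in the abstract reports AUC separately for each peptide length k from 8 to 14. The following is a minimal sketch of that kind of per-k-mer scoring, not the authors' code. The function names, the record layout `(peptide, label, score)`, and the toy data are hypothetical. The input formatting (space-separated residues, rare residues U, Z, O, B mapped to X) is assumed from the usual convention for ProtBert-style tokenizers and should be checked against the model actually used.

```python
import re
from collections import defaultdict


def to_protbert_input(peptide):
    """Format a peptide for a ProtBert-style tokenizer (assumed convention):
    upper-case residues separated by spaces, rare residues U/Z/O/B mapped to X."""
    cleaned = re.sub(r"[UZOB]", "X", peptide.upper())
    return " ".join(cleaned)


def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_kmer(records, k_values=range(8, 15)):
    """Group (peptide, label, score) records by peptide length k and
    return {k: AUC}. Peptides whose length is outside k_values are skipped."""
    groups = defaultdict(lambda: ([], []))
    for peptide, label, score in records:
        k = len(peptide)
        if k in k_values:
            groups[k][0].append(label)
            groups[k][1].append(score)
    return {k: auc(*groups[k]) for k in sorted(groups)}


# Toy records with made-up labels and scores, for illustration only.
toy_records = [
    ("SIINFEKL", 1, 0.9),
    ("AAAAAAAA", 0, 0.2),
    ("GILGFVFTL", 1, 0.3),
    ("AAAAAAAAA", 0, 0.7),
]
per_k = auc_by_kmer(toy_records)
```

Averaging such per-k values across alleles would give the "average AUC per k-mer" figure the abstract refers to. A length group that contains only one class raises `ValueError` here; a real evaluation would need to decide how to handle such groups.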

Similar articles

1
HLAB: learning the BiLSTM features from the ProtBert-encoded proteins for the class I HLA-peptide binding prediction.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac173.
2
APEX-pHLA: A novel method for accurate prediction of the binding between exogenous short peptides and HLA class I molecules.
Methods. 2024 Aug;228:38-47. doi: 10.1016/j.ymeth.2024.05.013. Epub 2024 May 19.
3
MATHLA: a robust framework for HLA-peptide binding prediction integrating bidirectional LSTM and multiple head attention mechanism.
BMC Bioinformatics. 2021 Jan 6;22(1):7. doi: 10.1186/s12859-020-03946-z.
4
Integrating peptides' sequence and energy of contact residues information improves prediction of peptide and HLA-I binding with unknown alleles.
BMC Bioinformatics. 2013;14 Suppl 8(Suppl 8):S1. doi: 10.1186/1471-2105-14-S8-S1. Epub 2013 May 9.
5
DeepSeqPanII: An Interpretable Recurrent Neural Network Model With Attention Mechanism for Peptide-HLA Class II Binding Prediction.
IEEE/ACM Trans Comput Biol Bioinform. 2022 Jul-Aug;19(4):2188-2196. doi: 10.1109/TCBB.2021.3074927. Epub 2022 Aug 8.
6
DeepNetBim: deep learning model for predicting HLA-epitope interactions based on network analysis by harnessing binding and immunogenicity information.
BMC Bioinformatics. 2021 May 5;22(1):231. doi: 10.1186/s12859-021-04155-y.
7
DeepSeqPan, a novel deep convolutional neural network model for pan-specific class I HLA-peptide binding affinity prediction.
Sci Rep. 2019 Jan 28;9(1):794. doi: 10.1038/s41598-018-37214-1.
8
Sequence conservation analysis and in silico human leukocyte antigen-peptide binding predictions for the Mtb72F and M72 tuberculosis candidate vaccine antigens.
BMC Immunol. 2015 Oct 22;16:63. doi: 10.1186/s12865-015-0119-7.
9
Toward the prediction of class I and II mouse major histocompatibility complex-peptide-binding affinity: in silico bioinformatic step-by-step guide using quantitative structure-activity relationships.
Methods Mol Biol. 2007;409:227-45. doi: 10.1007/978-1-60327-118-9_16.
10
A comprehensive review and performance evaluation of bioinformatics tools for HLA class I peptide-binding prediction.
Brief Bioinform. 2020 Jul 15;21(4):1119-1135. doi: 10.1093/bib/bbz051.

Cited by

1
Bridging artificial intelligence and biological sciences: a comprehensive review of large language models in bioinformatics.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf357.
2
Fine-Tuning Protein Language Models Unlocks the Potential of Underrepresented Viral Proteomes.
bioRxiv. 2025 Jun 11:2025.04.17.649224. doi: 10.1101/2025.04.17.649224.
3
Computation strategies and clinical applications in neoantigen discovery towards precision cancer immunotherapy.
Biomark Res. 2025 Jul 9;13(1):96. doi: 10.1186/s40364-025-00808-9.
4
Identifying the DNA methylation preference of transcription factors using ProtBERT and SVM.
PLoS Comput Biol. 2025 May 13;21(5):e1012513. doi: 10.1371/journal.pcbi.1012513. eCollection 2025 May.
5
iMFP-LG: Identify Novel Multi-functional Peptides Using Protein Language Models and Graph-based Deep Learning.
Genomics Proteomics Bioinformatics. 2025 Jan 15;22(6). doi: 10.1093/gpbjnl/qzae084.
6
GASIDN: identification of sub-Golgi proteins with multi-scale feature fusion.
BMC Genomics. 2024 Oct 30;25(1):1019. doi: 10.1186/s12864-024-10954-3.
7
PEZy-miner: An artificial intelligence driven approach for the discovery of plastic-degrading enzyme candidates.
Metab Eng Commun. 2024 Sep 5;19:e00248. doi: 10.1016/j.mec.2024.e00248. eCollection 2024 Dec.
8
Transformers meets neoantigen detection: a systematic literature review.
J Integr Bioinform. 2024 Jul 4;21(2). doi: 10.1515/jib-2023-0043. eCollection 2024 Jun 1.
9
Artificial intelligence and neoantigens: paving the path for precision cancer immunotherapy.
Front Immunol. 2024 May 29;15:1394003. doi: 10.3389/fimmu.2024.1394003. eCollection 2024.
10
ACPPfel: Explainable deep ensemble learning for anticancer peptides prediction based on feature optimization.
Front Genet. 2024 Feb 29;15:1352504. doi: 10.3389/fgene.2024.1352504. eCollection 2024.