Pretoria: An effective computational approach for accurate and high-throughput identification of CD8 t-cell epitopes of eukaryotic pathogens.

Authors

Charoenkwan Phasit, Schaduangrat Nalini, Pham Nhat Truong, Manavalan Balachandran, Shoombuatong Watshara

Affiliations

Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand.

Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.

Publication information

Int J Biol Macromol. 2023 May 31;238:124228. doi: 10.1016/j.ijbiomac.2023.124228. Epub 2023 Mar 29.

DOI: 10.1016/j.ijbiomac.2023.124228
PMID: 36996953
Abstract

T-cells recognize antigenic epitopes present on major histocompatibility complex (MHC) molecules, triggering an adaptive immune response in the host. T-cell epitope (TCE) identification is challenging because of the extensive number of undetermined proteins found in eukaryotic pathogens, as well as MHC polymorphisms. In addition, conventional experimental approaches for TCE identification are time-consuming and expensive. Thus, computational approaches that can accurately and rapidly identify CD8 T-cell epitopes (TCEs) of eukaryotic pathogens based solely on sequence information may facilitate the discovery of novel CD8 TCEs in a cost-effective manner. Here, Pretoria (Predictor of CD8 TCEs of eukaryotic pathogens) is proposed as the first stack-based approach for accurate and large-scale identification of CD8 TCEs of eukaryotic pathogens. In particular, Pretoria enabled the extraction and exploration of crucial information embedded in CD8 TCEs by employing a comprehensive set of 12 well-known feature descriptors extracted from multiple groups, including physicochemical properties, composition-transition-distribution, pseudo-amino acid composition, and amino acid composition. These feature descriptors were then utilized to construct a pool of 144 different machine learning (ML)-based classifiers based on 12 popular ML algorithms. Finally, the feature selection method was used to effectively determine the important ML classifiers for the construction of our stacked model. The experimental results indicated that Pretoria is an accurate and effective computational approach for CD8 TCE prediction; it was superior to several conventional ML classifiers and the existing method in terms of the independent test, with an accuracy of 0.866, MCC of 0.732, and AUC of 0.921. Additionally, to maximize user convenience for high-throughput identification of CD8 TCEs of eukaryotic pathogens, a user-friendly web server of Pretoria (http://pmlabstack.pythonanywhere.com/Pretoria) was developed and made freely available.
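
As an illustration only, the following minimal Python sketch shows three ideas mentioned in the abstract: an amino acid composition descriptor computed from sequence alone, a pool of baseline classifiers formed by pairing descriptors with algorithms, and the accuracy and Matthews correlation coefficient (MCC) metrics. It is not the authors' implementation. The function names, the placeholder descriptor and algorithm labels, the example peptide, and the confusion-matrix counts are all assumptions chosen for the example. The abstract does not state how the 144 classifiers were formed; one-per-pair is an assumption that happens to match 12 x 12 = 144.

```python
from collections import Counter
from itertools import product
from math import sqrt

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide: str) -> list[float]:
    """Return the fraction of each standard residue in the peptide (20 values).

    Non-standard characters count toward the length but get no feature slot.
    """
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must be non-empty")
    counts = Counter(peptide)
    return [counts[aa] / len(peptide) for aa in AMINO_ACIDS]


def baseline_pool(descriptors, algorithms):
    """Pair every descriptor with every algorithm (hypothetical pooling scheme)."""
    return list(product(descriptors, algorithms))


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Example peptide chosen for illustration; it is not taken from the paper.
features = amino_acid_composition("SIINFEKL")
print(features[AMINO_ACIDS.index("I")])  # 0.25 (2 of 8 residues are I)

# Placeholder labels: the abstract does not list the 12 descriptors or algorithms.
descriptors = [f"descriptor_{i}" for i in range(1, 13)]
algorithms = [f"algorithm_{j}" for j in range(1, 13)]
print(len(baseline_pool(descriptors, algorithms)))  # 144

# Hypothetical confusion-matrix counts, not results from the paper.
print(accuracy(tp=45, tn=45, fp=5, fn=5))  # 0.9
print(mcc(tp=45, tn=45, fp=5, fn=5))       # 0.8
```

In a stacked model, the outputs of selected baseline classifiers would typically serve as inputs to a second-level learner; that step is omitted here because the abstract does not specify which classifiers or meta-learner were used.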

Similar articles

1. Pretoria: An effective computational approach for accurate and high-throughput identification of CD8 t-cell epitopes of eukaryotic pathogens.
Int J Biol Macromol. 2023 May 31;238:124228. doi: 10.1016/j.ijbiomac.2023.124228. Epub 2023 Mar 29.
2. CD8TCEI-EukPath: A Novel Predictor to Rapidly Identify CD8 T-Cell Epitopes of Eukaryotic Pathogens Using a Hybrid Feature Selection Approach.
Front Genet. 2022 Jul 22;13:935989. doi: 10.3389/fgene.2022.935989. eCollection 2022.
3. TROLLOPE: A novel sequence-based stacked approach for the accelerated discovery of linear T-cell epitopes of hepatitis C virus.
PLoS One. 2023 Aug 25;18(8):e0290538. doi: 10.1371/journal.pone.0290538. eCollection 2023.
4. Decision tree based ensemble machine learning model for the prediction of Zika virus T-cell epitopes as potential vaccine candidates.
Sci Rep. 2022 May 12;12(1):7810. doi: 10.1038/s41598-022-11731-6.
5. StackTTCA: a stacking ensemble learning-based framework for accurate and high-throughput identification of tumor T cell antigens.
BMC Bioinformatics. 2023 Jul 28;24(1):301. doi: 10.1186/s12859-023-05421-x.
6. Enhancing in silico protein-based vaccine discovery for eukaryotic pathogens using predicted peptide-MHC binding and peptide conservation scores.
PLoS One. 2014 Dec 29;9(12):e115745. doi: 10.1371/journal.pone.0115745. eCollection 2014.
7. StackDPPIV: A novel computational approach for accurate prediction of dipeptidyl peptidase IV (DPP-IV) inhibitory peptides.
Methods. 2022 Aug;204:189-198. doi: 10.1016/j.ymeth.2021.12.001. Epub 2021 Dec 6.
8. A robust deep learning workflow to predict CD8+ T-cell epitopes.
Genome Med. 2023 Sep 13;15(1):70. doi: 10.1186/s13073-023-01225-z.
9. AMYPred-FRL is a novel approach for accurate prediction of amyloid proteins by using feature representation learning.
Sci Rep. 2022 May 11;12(1):7697. doi: 10.1038/s41598-022-11897-z.
10. Accelerating the identification of the allergenic potential of plant proteins using a stacked ensemble-learning framework.
J Biomol Struct Dyn. 2024 Feb 22:1-13. doi: 10.1080/07391102.2024.2318482.

Cited by

1. Advancing the accuracy of clathrin protein prediction through multi-source protein language models.
Sci Rep. 2025 Jul 8;15(1):24403. doi: 10.1038/s41598-025-08510-4.
2. BGATT-GR: accurate identification of glucocorticoid receptor antagonists based on data augmentation combined with BiGRU-attention.
Sci Rep. 2025 Jul 1;15(1):21402. doi: 10.1038/s41598-025-05839-8.
3. GRU4ACE: Enhancing ACE inhibitory peptide prediction by integrating gated recurrent unit with multi-source feature embeddings.
Protein Sci. 2025 Jun;34(6):e70026. doi: 10.1002/pro.70026.
4. Transformer-based deep learning enables improved B-cell epitope prediction in parasitic pathogens: A proof-of-concept study on Fasciola hepatica.
PLoS Negl Trop Dis. 2025 Apr 29;19(4):e0012985. doi: 10.1371/journal.pntd.0012985. eCollection 2025 Apr.
5. Conotoxins: Classification, Prediction, and Future Directions in Bioinformatics.
Toxins (Basel). 2025 Feb 9;17(2):78. doi: 10.3390/toxins17020078.
6. ac4C-AFL: A high-precision identification of human mRNA N4-acetylcytidine sites based on adaptive feature representation learning.
Mol Ther Nucleic Acids. 2024 Apr 24;35(2):102192. doi: 10.1016/j.omtn.2024.102192. eCollection 2024 Jun 11.