Programa de Pós-Graduação em Ciências Farmacêuticas-CiPharma, Laboratório de Pesquisas Clínicas, Escola de Farmácia, Universidade Federal de Ouro Preto, Campus Morro do Cruzeiro, Ouro Preto, MG 35400-000, Brazil.
BMC Bioinformatics. 2012 Nov 21;13:309. doi: 10.1186/1471-2105-13-309.
Epitope prediction using computational methods is one of the most promising approaches to vaccine development. Reduced time and cost, together with the availability of completely sequenced genomes, are the main motivations for using reverse vaccinology. Parasites of the genus Leishmania are widespread and are the etiologic agents of leishmaniasis. Currently, there is no efficient vaccine against this pathogen, and drug treatment is highly toxic. The lack of sufficiently large datasets of experimentally validated parasite epitopes is a serious limitation, especially for trypanosomatid genomes. In this work we highlight the predictive performance of several algorithms, evaluated through a MySQL database built for the purpose of: a) evaluating the prediction performance of individual algorithms and of their combinations for CD8+ T-cell epitopes, B-cell epitopes and subcellular localization, by means of AUC (area under the curve) and a threshold-dependent method that employs a confusion matrix; b) integrating data from experimentally validated and in silico predicted epitopes; and c) integrating the subcellular localization predictions and experimental data. The NetCTL, NetMHC, BepiPred, BCPred12, and AAP12 algorithms were used for in silico epitope prediction, and WoLF PSORT, Sigcleave and TargetP for in silico subcellular localization prediction, against trypanosomatid genomes.
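The two evaluation measures named above, AUC and a threshold-dependent confusion matrix, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `auc` and `confusion_matrix`, the toy scores, and the 0.5 cut-off are assumptions made only for illustration. AUC is computed here as the fraction of (epitope, non-epitope) pairs in which the epitope receives the higher score, with ties counted as one half.

```python
from typing import Dict, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Threshold-independent AUC: probability that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def confusion_matrix(scores: Sequence[float], labels: Sequence[int],
                     threshold: float) -> Dict[str, int]:
    """Threshold-dependent counts: a score >= threshold is called an epitope."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            counts["tp"] += 1
        elif predicted_positive and y == 0:
            counts["fp"] += 1
        elif not predicted_positive and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Hypothetical toy data: predictor scores and experimental labels (1 = validated epitope).
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_cm = confusion_matrix(toy_scores, toy_labels, threshold=0.5)
```

The pairwise loop is quadratic in the number of examples; for large datasets a rank-based formulation would be preferable.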
A database-driven epitope prediction method was developed with built-in functions capable of: a) removing redundancy from the experimental data; b) parsing algorithm predictions and storing experimentally validated and predicted data; and c) evaluating algorithm performance. The results show that better performance is achieved when combined predictions are considered. This is particularly true for B-cell epitope predictors, where the combined prediction of AAP12 and BCPred12 reached an AUC value of 0.77. For CD8+ T-cell epitope predictors, the combined prediction of NetCTL and NetMHC reached an AUC value of 0.64. Finally, for subcellular localization prediction, the best performance is achieved when the combined prediction of Sigcleave, TargetP and WoLF PSORT is used.
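The abstract does not state how the predictions of two algorithms were combined. As one plausible illustration only, the sketch below rescales each predictor's scores to the range [0, 1] and averages them per peptide. The function names `min_max` and `combine_scores` and the example values are hypothetical and are not taken from the published pipeline.

```python
from typing import List, Sequence


def min_max(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant score list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine_scores(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Assumed combination rule: mean of the two min-max normalized scores."""
    if len(a) != len(b):
        raise ValueError("both predictors must score the same peptides")
    return [(x + y) / 2.0 for x, y in zip(min_max(a), min_max(b))]


# Hypothetical scores from two predictors for the same three peptides.
predictor_a = [0.0, 5.0, 10.0]
predictor_b = [1.0, 2.0, 3.0]
combined = combine_scores(predictor_a, predictor_b)
```

Normalizing first keeps a predictor with a wider score range from dominating the average. Other rules, such as requiring both predictors to exceed their own thresholds, would be equally consistent with the text.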
Our study indicates that combining B-cell epitope predictors is the best approach for predicting epitopes in proteins of protozoan parasites. For subcellular localization, the best result was obtained when the predictions of the three algorithms were combined. The developed pipeline is available from the authors upon request.