EpitopeVec: linear epitope prediction using deep protein sequence embeddings.

Affiliations

Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig 38124, Germany.

Braunschweig Integrated Center of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig 38106, Germany.

Publication information

Bioinformatics. 2021 Dec 7;37(23):4517-4525. doi: 10.1093/bioinformatics/btab467.

Abstract

MOTIVATION

B-cell epitopes (BCEs) play a pivotal role in the development of peptide vaccines, immuno-diagnostic reagents and antibody production, and thus in infectious disease prevention and diagnostics in general. Experimental methods for determining BCEs are costly and time-consuming, so computational methods for the rapid identification of BCEs are essential. Although several such methods have been developed, generalizability remains a major concern: cross-testing of classifiers trained on one dataset and tested on another has revealed accuracies of 51-53%.

RESULTS

We describe a new method called EpitopeVec, which uses a combination of residue properties, modified antigenicity scales, and protein language model-based representations (protein vectors) as peptide features for linear BCE prediction. Extensive benchmarking of EpitopeVec and other state-of-the-art methods for linear BCE prediction on several large and small datasets, together with cross-testing, showed that EpitopeVec outperformed the other methods in terms of accuracy and area under the curve. Because predictive performance depended on the species origin of the respective antigens (viral, bacterial and eukaryotic), we also trained our method on a large viral dataset to create a dedicated linear viral BCE predictor with improved cross-testing performance.
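The idea of combining several kinds of peptide features into one vector can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the EpitopeVec implementation: all function names are hypothetical, amino acid composition stands in for the residue properties, the Kyte-Doolittle hydropathy scale stands in for the modified antigenicity scales, and a hashed 3-mer count vector stands in for the protein language model embedding. The classifier that would consume such vectors is not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values. Assumption: used here only as a
# stand-in for the modified antigenicity scales named in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(peptide):
    """Fraction of each of the 20 standard amino acids in the peptide."""
    counts = Counter(peptide)
    return [counts[aa] / len(peptide) for aa in AMINO_ACIDS]


def mean_scale(peptide, scale):
    """Average per-residue scale value over the peptide."""
    return sum(scale[aa] for aa in peptide) / len(peptide)


def toy_embedding(peptide, dim=8, k=3):
    """Hypothetical placeholder for a learned protein vector.

    Counts overlapping k-mers into `dim` buckets with a simple
    deterministic hash and normalizes by the number of k-mers.
    A real pipeline would look up trained embeddings instead.
    """
    vec = [0.0] * dim
    kmers = [peptide[i:i + k] for i in range(len(peptide) - k + 1)]
    for kmer in kmers:
        vec[sum(ord(c) for c in kmer) % dim] += 1.0
    return [v / len(kmers) for v in vec] if kmers else vec


def peptide_features(peptide):
    """Concatenate the three feature groups into one flat vector.

    Assumes a non-empty peptide made of the 20 standard residues.
    """
    peptide = peptide.upper()
    return (
        composition(peptide)
        + [mean_scale(peptide, KYTE_DOOLITTLE)]
        + toy_embedding(peptide)
    )


features = peptide_features("ACDEFGHIKLMNPQRSTVWY")
print(len(features))  # 20 composition values + 1 scale mean + 8 embedding values
```

Concatenation keeps each feature group independent, so one group can be swapped out (for example, replacing the toy embedding with real protein vectors) without changing the others.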

AVAILABILITY AND IMPLEMENTATION

The software is available at https://github.com/hzi-bifo/epitope-prediction.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
