Prediction of conformational B-cell epitopes from 3D structures by random forests with a distance-based feature.

Affiliation

School of Computer, Wuhan University, Wuhan 430072, China.

Publication information

BMC Bioinformatics. 2011 Aug 17;12:341. doi: 10.1186/1471-2105-12-341.

DOI:10.1186/1471-2105-12-341

PMID:21846404

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3228550/

Abstract

BACKGROUND

Antigen-antibody interactions are key events in the immune system and provide important clues to immune processes and responses. In these interactions, the specific sites on an antigen that are directly bound by B-cell-produced antibodies are known as B-cell epitopes. Epitope identification is an active topic in bioinformatics because of its potential use in epitope-based drug design. Although most B-cell epitopes are discontinuous (conformational), conformational epitope prediction has received insufficient attention, and the performance of existing methods is far from satisfactory.

RESULTS

To develop a high-accuracy model, we focus on several factors that may affect prediction performance: the impact of interior residues, the different contributions of adjacent residues, and imbalanced data, which contain many more non-epitope residues than epitope residues. We address these issues with three strategies. First, we introduce the concept of a 'thick surface patch' in place of the 'surface patch' to describe the local spatial context of each surface residue, so that the impact of interior residues is taken into account. A comparison between the thick surface patch and the surface patch shows that interior residues contribute to the recognition of epitopes. Second, we observe a statistically significant difference between the distance distributions of non-epitope patches and epitope patches, and therefore propose an adjacent residue distance feature that reflects the unequal contributions of adjacent residues to the location of binding sites. Third, we adopt a bootstrapping and voting procedure to deal with the imbalanced dataset. Based on these ideas, we propose a new method that identifies conformational B-cell epitopes from 3D structures by combining conventional features with the proposed feature, using the random forest (RF) algorithm as the classification engine. Experiments show that our method can predict conformational B-cell epitopes with high accuracy. Evaluated by leave-one-out cross-validation (LOOCV), our method achieves a mean AUC of 0.633 on the benchmark bound dataset and a mean AUC of 0.654 on the benchmark unbound dataset. Compared with state-of-the-art prediction models in an independent test, our method demonstrates comparable or better performance.
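The bootstrapping and voting idea mentioned above can be illustrated with a minimal sketch. The abstract does not give the procedure's details, so everything here is an assumption made for illustration: each round draws an equal number of epitope and non-epitope residues with replacement, one model is trained per balanced sample, and the final label is a majority vote. The base learner is a simple nearest-centroid stand-in, not the random forest used in the paper, and the names `balanced_bootstrap_vote` and `train_nearest_centroid` are hypothetical.

```python
import random
from collections import Counter


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train_nearest_centroid(X, y):
    """Stand-in base learner (NOT the paper's random forest).

    Returns a function that assigns a vector to the class whose
    centroid is closest in squared Euclidean distance.
    """
    cents = {
        label: centroid([x for x, t in zip(X, y) if t == label])
        for label in set(y)
    }

    def predict(x):
        return min(
            cents,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, cents[c])),
        )

    return predict


def balanced_bootstrap_vote(X, y, n_rounds=11, train=train_nearest_centroid, seed=0):
    """Train one model per balanced bootstrap sample; predict by majority vote.

    Assumed labels: 1 = epitope residue, 0 = non-epitope residue.
    An odd n_rounds avoids tied votes for binary labels.
    """
    rng = random.Random(seed)
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    k = min(len(pos), len(neg))  # size of each class in every sample
    models = []
    for _ in range(n_rounds):
        # Sample k indices with replacement from each class, so every
        # training set is balanced regardless of the original class ratio.
        idx = rng.choices(pos, k=k) + rng.choices(neg, k=k)
        models.append(train([X[i] for i in idx], [y[i] for i in idx]))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Toy, made-up feature vectors: 2 "epitope" rows and 6 "non-epitope" rows.
X = [[0.9, 0.8], [1.0, 1.1],
     [0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [-0.1, 0.0], [0.0, -0.2], [0.1, 0.2]]
y = [1, 1, 0, 0, 0, 0, 0, 0]

clf = balanced_bootstrap_vote(X, y)
print(clf([1.0, 1.0]), clf([0.0, 0.0]))  # prints: 1 0
```

Because `train` is a parameter, any function that takes `(X, y)` and returns a per-vector predictor could be substituted, for example a thin wrapper around a random forest implementation.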

CONCLUSIONS

Our method is shown to be effective for predicting conformational epitopes. Based on this study, we developed a tool that predicts conformational epitopes from 3D structures, available at http://code.google.com/p/my-project-bpredictor/downloads/list.
