Bioinformatics Research Center, School of Computer Engineering, Nanyang Technological University, Singapore.
BMC Bioinformatics. 2012;13 Suppl 17(Suppl 17):S20. doi: 10.1186/1471-2105-13-S17-S20. Epub 2012 Dec 13.
Predicting B-cell epitopes from antigens helps to explain the immune basis of antibody-antigen recognition and supports vaccine design and drug development. Tremendous effort has been devoted to this long-studied problem; however, existing methods share at least two limitations. First, they favor epitopes with protrusive conformations and perform poorly on planar epitopes. Second, they predict all of the antigenic residues of an antigen as belonging to a single epitope, even when the antigen has multiple non-overlapping epitopes.
In this paper, we propose dividing an antigen surface graph into subgraphs using a Markov Clustering algorithm, and then constructing a classifier to label these subgraphs as epitope or non-epitope subgraphs. This classifier is then used to predict epitopes for a test antigen. On a large data set comprising 92 antigen-antibody PDB complexes, our method significantly outperforms the state-of-the-art epitope prediction methods, achieving a 24.7% higher average F-score than the best existing models. In particular, our method can successfully identify epitopes whose non-planarity is too small to be handled by the other models. Our method can also detect multiple epitopes whenever they exist.
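As an illustration of the clustering step only, the following is a minimal sketch of the generic Markov Clustering procedure (alternating expansion and inflation of a column-stochastic matrix) applied to a toy graph. It is not the authors' implementation: the abstract does not say how the antigen surface graph is built or which parameters were used, so the function name `markov_cluster`, the toy adjacency matrix `bridged`, and the values chosen for `expansion`, `inflation`, `tol`, and `threshold` are all assumptions made for this example.

```python
import numpy as np


def markov_cluster(adjacency, expansion=2, inflation=2.0,
                   max_iter=100, tol=1e-8, threshold=1e-5):
    """Generic Markov Clustering sketch (parameter values are assumptions)."""
    a = np.asarray(adjacency, dtype=float)
    # Add self-loops, then normalise every column so it sums to 1.
    m = a + np.eye(len(a))
    m = m / m.sum(axis=0, keepdims=True)

    for _ in range(max_iter):
        prev = m
        # Expansion: let the random walk take `expansion` steps.
        m = np.linalg.matrix_power(m, expansion)
        # Inflation: raise entries to a power and renormalise the columns,
        # which strengthens strong flows and weakens weak ones.
        m = np.power(m, inflation)
        m = m / m.sum(axis=0, keepdims=True)
        if np.abs(m - prev).max() < tol:
            break

    # Each row with non-negligible entries is an attractor; the columns
    # it attracts form one cluster. A set removes duplicate clusters.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.flatnonzero(row > threshold))
        if members:
            clusters.add(members)
    return sorted(clusters)


# Toy stand-in for a surface graph: two triangles of "residues"
# (0-1-2 and 3-4-5) joined by a single edge between nodes 2 and 3.
bridged = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])

clusters = markov_cluster(bridged)
print(clusters)
```

In the approach described above, each resulting cluster would correspond to one candidate subgraph that the classifier then labels as epitope or non-epitope. A larger `inflation` value generally yields more, smaller clusters.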
Various protrusive and planar patches on the surface of antigens can be distinguished by using graphical models combined with unsupervised clustering and supervised learning. The difficult problem of identifying multiple epitopes in an antigen is made easier by our subgraph approach. The outstanding residue combinations found during supervised learning will be useful for forming new hypotheses in future studies.