Lian Yao, Ge Meng, Pan Xian-Ming
The Key Laboratory of Bioinformatics, Ministry of Education, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
BMC Bioinformatics. 2014 Dec 19;15(1):414. doi: 10.1186/s12859-014-0414-y.
B-cell epitopes have been studied extensively due to their immunological applications, such as peptide-based vaccine development, antibody production, and disease diagnosis and therapy. Despite several decades of research, the accurate prediction of linear B-cell epitopes has remained a challenging task.
In this work, we developed a novel linear B-cell epitope prediction model based on the antigen's primary sequence information, using multiple linear regression (MLR). We evaluated the model with a 10-fold cross-validation test on a large non-redundant dataset. To alleviate the problem caused by noise in the negative dataset, 300 experiments were performed on 300 sub-datasets. We achieved an overall sensitivity of 81.8%, a precision of 64.1%, and an area under the receiver operating characteristic curve (AUC) of 0.728.
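The evaluation protocol described above can be illustrated with a minimal sketch. It is not the EPMLR implementation: the feature vectors are synthetic toy data, the helper names (`fit_mlr`, `cross_validate`, `repeated_experiments`) are hypothetical, and the code assumes that each sub-dataset pairs all positives with an equally sized random sample of negatives and that a score threshold of 0.5 separates the classes. The abstract specifies none of these details.

```python
import random

def fit_mlr(X, y, ridge=1e-6):
    """Least-squares fit of y ~ w0 + w.x via the normal equations.
    A tiny ridge term (an assumption, for numerical stability) is added."""
    rows = [[1.0] + list(x) for x in X]
    d = len(rows[0])
    # Augmented matrix [A^T A + ridge*I | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(d):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][d] / A[i][i] for i in range(d)]

def predict(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def auc(scores, labels):
    """Probability that a random positive outscores a random negative
    (ties count as 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cross_validate(X, y, k=10, threshold=0.5, seed=0):
    """k-fold cross-validation; every sample is scored by a model
    that was trained without it."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = [0.0] * len(X)
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        w = fit_mlr([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            scores[i] = predict(w, X[i])
    tp = sum(1 for s, l in zip(scores, y) if s >= threshold and l == 1)
    fp = sum(1 for s, l in zip(scores, y) if s >= threshold and l == 0)
    fn = sum(1 for s, l in zip(scores, y) if s < threshold and l == 1)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "auc": auc(scores, y),
    }

def repeated_experiments(pos, neg_pool, n_runs=300, seed=0):
    """Average the metrics over n_runs sub-datasets, each built from all
    positives plus a fresh random sample of negatives (an assumption)."""
    rng = random.Random(seed)
    results = []
    for run in range(n_runs):
        neg = rng.sample(neg_pool, len(pos))
        X = pos + neg
        y = [1] * len(pos) + [0] * len(neg)
        results.append(cross_validate(X, y, seed=run))
    return {k: sum(r[k] for r in results) / len(results) for k in results[0]}

# Toy data: 3 synthetic features per sample, NOT real epitope features.
rng = random.Random(42)
pos = [[rng.gauss(1.0, 1.0) for _ in range(3)] for _ in range(60)]
neg_pool = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(300)]
avg = repeated_experiments(pos, neg_pool, n_runs=5)  # 5 runs keeps the demo fast
print(avg)
```

Averaging over many resampled negative sets is one plausible way to damp the effect of mislabeled negatives, since no single noisy sample dominates the reported metrics.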
We have presented a reliable method for identifying linear B-cell epitopes using the antigen's primary sequence information. Moreover, a web server, EPMLR, has been developed for linear B-cell epitope prediction: http://www.bioinfo.tsinghua.edu.cn/epitope/EPMLR/ .