Institute of Pharmaceutical Sciences, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany.
Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology, La Jolla, California, United States of America.
PLoS Comput Biol. 2022 Feb 18;18(2):e1009151. doi: 10.1371/journal.pcbi.1009151. eCollection 2022 Feb.
In-silico methods for the prediction of epitopes can support and improve workflows for vaccine design, antibody production, and disease therapy. So far, B cell and T cell epitope prediction has been directed exclusively towards peptidic antigens. Nevertheless, various non-peptidic molecular classes can be recognized by immune cells. These compounds have not yet been studied systematically, and prediction approaches are lacking. The ability to predict the epitope activity of non-peptidic compounds could have far-reaching implications, for example for immunogenic risk assessment of the large number of drugs and other xenobiotics. Here we present the first general attempt to predict the epitope activity of non-peptidic compounds, using the Immune Epitope Database (IEDB) as a source of positive samples. The molecules stored in the Chemical Entities of Biological Interest (ChEBI) database were chosen as background samples. The molecules were clustered into eight homogeneous molecular groups, and classifiers were built for each cluster with the aim of separating the epitopes from the background. Different molecular feature encoding schemes and machine learning models were compared against each other. For those models that achieved high performance based on simple decision rules, the molecular features were investigated further. Additionally, the findings were used to build a web server that allows for the immunogenic investigation of non-peptidic molecules (http://tools-staging.iedb.org/np_epitope_predictor). The prediction quality was tested with samples from independent evaluation datasets, and the implemented method achieved Receiver Operating Characteristic-Area Under Curve (ROC-AUC) values ranging from 0.69 to 0.96, depending on the molecule cluster.
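As an illustration of the evaluation described above, the following minimal Python sketch computes one ROC-AUC value per molecule cluster from classifier scores. It is not the authors' implementation: the function names (`roc_auc`, `auc_per_cluster`), the record layout, the cluster names, and all numbers are hypothetical and chosen only to show how a per-cluster ROC-AUC could be obtained.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """Rank-based ROC-AUC (Mann-Whitney U statistic); tied scores share an average rank."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")

    # Sort sample indices by score and assign 1-based ranks, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def auc_per_cluster(records):
    """records: iterable of (cluster_id, label, score); label 1 = epitope, 0 = background."""
    grouped = defaultdict(lambda: ([], []))
    for cluster_id, label, score in records:
        grouped[cluster_id][0].append(label)
        grouped[cluster_id][1].append(score)
    return {c: roc_auc(ys, ss) for c, (ys, ss) in grouped.items()}


# Made-up toy data: (hypothetical cluster name, true label, classifier score).
records = [
    ("cluster_a", 1, 0.9), ("cluster_a", 0, 0.2),
    ("cluster_a", 1, 0.7), ("cluster_a", 0, 0.4),
    ("cluster_b", 1, 0.6), ("cluster_b", 0, 0.8),
    ("cluster_b", 1, 0.9), ("cluster_b", 0, 0.3),
]

if __name__ == "__main__":
    for cluster, auc in sorted(auc_per_cluster(records).items()):
        print(cluster, round(auc, 2))
```

Reporting the metric separately for each cluster, rather than pooling all molecules, mirrors the abstract's statement that performance depends on the molecule cluster.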