Tung Chun-Wei, Ho Shinn-Ying
Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan.
Bioinformatics. 2007 Apr 15;23(8):942-9. doi: 10.1093/bioinformatics/btm061. Epub 2007 Mar 24.
Modeling the antigen-processing pathway, including major histocompatibility complex (MHC) binding, and predicting the immunogenicity of the resulting MHC-binding peptides are both essential to developing a computer-aided system for peptide-based vaccine design, which is one goal of immunoinformatics. Numerous studies have addressed modeling of the immunogenic pathway, but not the intractable problem of immunogenicity prediction, which is complicated by the effects of many intrinsic and extrinsic factors. Moderate affinity of the MHC-peptide complex is essential to induce immune responses, but the relationship between affinity and peptide immunogenicity is too weak to be used for predicting immunogenicity. This study focuses on mining informative physicochemical properties from known experimental immunogenicity data in order to understand immune responses and to predict the immunogenicity of MHC-binding peptides accurately.
This study proposes a computational method that mines a feature set of informative physicochemical properties from MHC class I binding peptides and uses it to design a support vector machine (SVM) based system, named POPI, for the prediction of peptide immunogenicity. The high performance of POPI arises mainly from an inheritable bi-objective genetic algorithm, which automatically determines the best number m of properties to select out of 531 physicochemical properties, identifies these m properties, and tunes the SVM parameters, all simultaneously. The dataset, consisting of 428 human MHC class I binding peptides belonging to four classes of immunogenicity, was established from MHCPEP, a database of MHC-binding peptides (Brusic et al., 1998). Using the m = 23 selected properties, POPI achieves an accuracy of 64.72% under leave-one-out cross-validation, compared with 54.91% for ALIGN and 53.23% for PSI-BLAST, two sequence alignment-based prediction methods. POPI is the first computational system for prediction of peptide immunogenicity based on physicochemical properties.
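The evaluation protocol described above can be illustrated with a minimal sketch. It is not the POPI implementation: it assumes each peptide is encoded as the mean of each property over its residues, it uses only two illustrative scales (the Kyte-Doolittle hydropathy index and a simplified side-chain charge) rather than the 23 selected properties, it swaps in a 1-nearest-neighbour classifier for the SVM, and it omits the genetic algorithm entirely. The peptides, labels, and all function names are hypothetical.

```python
from statistics import mean

# Kyte-Doolittle hydropathy index, one widely used physicochemical scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charge at neutral pH (assumption: all other residues 0).
CHARGE = {"D": -1.0, "E": -1.0, "K": 1.0, "R": 1.0}


def encode(peptide, scales):
    """Encode a peptide as the mean value of each property over its residues."""
    return [mean(scale.get(aa, 0.0) for aa in peptide) for scale in scales]


def nearest_label(x, train):
    """1-nearest-neighbour by squared Euclidean distance (stand-in for the SVM)."""
    def dist(item):
        return sum((a - b) ** 2 for a, b in zip(x, item[0]))
    return min(train, key=dist)[1]


def loocv_accuracy(samples):
    """Leave-one-out cross-validation: hold out each sample once, train on the rest."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += nearest_label(x, rest) == y
    return hits / len(samples)


# Made-up peptides and labels, for illustration only (not from MHCPEP).
toy = [
    ("LLLLVVVVI", "high"),
    ("ILVLLAVFL", "high"),
    ("KDEKRDEKD", "none"),
    ("DEKRKDEDE", "none"),
]
samples = [(encode(p, [HYDROPATHY, CHARGE]), label) for p, label in toy]
accuracy = loocv_accuracy(samples)
print(f"LOOCV accuracy on toy data: {accuracy:.2%}")
```

Leave-one-out cross-validation suits a dataset of this size (428 peptides) because every peptide is used for testing exactly once while all the others remain available for training.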
A web server for prediction of peptide immunogenicity (POPI) and the dataset of MHC class I binding peptides used in this study (PEPMHCI) are available at http://iclab.life.nctu.edu.tw/POPI