Wang Juan, Wang Xiao-Yu, Shu Mao, Wang Yuan-Qiang, Lin Yong, Wang Li, Cheng Xiao-Ming, Lin Zhi-Hua
College of Pharmacy and Bioengineering, Chongqing University of Technology, Chongqing 400050, PR China.
Protein Pept Lett. 2011 Sep;18(9):956-63. doi: 10.2174/092986611796011437.
MHC-epitope binding plays a key role in the cellular immune response. Accurate prediction of MHC-epitope binding affinity can greatly expedite epitope screening by reducing cost and experimental effort. In this paper, 13 T descriptors, derived from 544 physicochemical properties of the natural amino acids, were used to characterize the epitope peptide sequences of 4 MHC class I alleles, and optimal QSAR models were constructed using stepwise regression combined with multiple linear regression (STR-MLR). For the HLA-A0201, HLA-A0203, HLA-A0206 and HLA-A1101 alleles, the leave-one-out cross-validation values (Q²(train)) were 0.581, 0.553, 0.525 and 0.588; the squared correlation coefficients of the training datasets (R²(train)) were 0.607, 0.582, 0.556 and 0.606; and the squared correlation coefficients of the test datasets (R²(test)) were 0.533, 0.506, 0.501 and 0.502, respectively. The results showed that all four models achieved good predictive performance and can help explain the mechanism of interaction between MHC and epitope. The descriptors will be useful for structure characterization and activity prediction of peptide sequences.
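As a rough illustration of the workflow the abstract describes (descriptor matrix, stepwise variable selection, multiple linear regression, leave-one-out Q² and external R²), the following is a minimal sketch and not the authors' implementation. It assumes a random stand-in descriptor matrix instead of real T descriptors, synthetic affinities, and a forward-only selection rule that adds a variable when it raises Q² by at least a hypothetical `min_gain` threshold; classical stepwise regression usually relies on F-tests for entry and removal, which is not reproduced here. All function names are illustrative.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a descriptor matrix."""
    return coef[0] + X @ coef[1:]


def r2(y, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def q2_loo(X, y):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SS_tot."""
    n = len(y)
    press = 0.0
    for i in range(n):
        mask = np.arange(n) != i          # hold out sample i
        coef = fit_ols(X[mask], y[mask])  # refit on the remaining samples
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def forward_stepwise(X, y, min_gain=0.01):
    """Greedy forward selection: add the column that most improves Q2,
    stopping when the best improvement falls below min_gain (assumed rule)."""
    selected, best_q2 = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        q2, j = max((q2_loo(X[:, selected + [j]], y), j) for j in remaining)
        if q2 - best_q2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_q2 = q2
    return selected, best_q2


# Synthetic stand-in data: 60 "peptides", 8 descriptor columns,
# with only columns 1 and 4 actually driving the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(scale=0.5, size=60)

X_train, y_train = X[:45], y[:45]
X_test, y_test = X[45:], y[45:]

selected, q2_train = forward_stepwise(X_train, y_train)
coef = fit_ols(X_train[:, selected], y_train)
r2_train = r2(y_train, predict(coef, X_train[:, selected]))
r2_test = r2(y_test, predict(coef, X_test[:, selected]))

print("selected columns:", selected)
print("Q2(train):", q2_train, "R2(train):", r2_train, "R2(test):", r2_test)
```

With real data, each row of `X` would hold the descriptor values for one epitope peptide and `y` the measured binding affinity for a single allele. Because leave-one-out prediction errors are never smaller than the fitted residuals of the same linear model, Q²(train) cannot exceed R²(train), which matches the ordering of the values reported above.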