Toward the prediction of class I and II mouse major histocompatibility complex-peptide-binding affinity: in silico bioinformatic step-by-step guide using quantitative structure-activity relationships.

Author information

Hattotuwagama Channa K, Doytchinova Irini A, Flower Darren R

Affiliation

The Jenner Institute, University of Oxford, Berkshire, UK.

Publication information

Methods Mol Biol. 2007;409:227-45. doi: 10.1007/978-1-60327-118-9_16.

DOI:10.1007/978-1-60327-118-9_16

PMID:18450004

Abstract

Quantitative structure-activity relationship (QSAR) analysis is a cornerstone of modern informatics. Predictive computational models of peptide-major histocompatibility complex (MHC)-binding affinity based on QSAR technology have now become important components of modern computational immunovaccinology. Historically, such approaches have been built around semiqualitative classification methods, but these are now giving way to quantitative regression methods. We review three methods. Two of them, a 2D-QSAR additive-partial least squares (PLS) method and a 3D-QSAR comparative molecular similarity index analysis (CoMSIA) method, can identify the sequence dependence of peptide-binding specificity for various class I MHC alleles from the reported binding affinities (IC50) of peptide sets. The third is an iterative self-consistent (ISC) PLS-based additive method, a recently developed extension of the additive method for predicting the affinity of class II peptides. The QSAR methods presented here have established themselves as immunoinformatic techniques complementary to existing methodology and useful in the quantitative prediction of binding affinity: current methods for the in silico identification of T-cell epitopes (which form the basis of many vaccines, diagnostics, and reagents) rely on the accurate computational prediction of peptide-MHC affinity. We have reviewed various human and mouse class I and class II allele models. Studied alleles comprise HLA-A0101, HLA-A0201, HLA-A0202, HLA-A0203, HLA-A0206, HLA-A0301, HLA-A1101, HLA-A3101, HLA-A6801, HLA-A6802, HLA-B3501, H2-K(k), H2-K(b), H2-D(b), HLA-DRB10101, HLA-DRB10401, HLA-DRB10701, I-A(b), I-A(d), I-A(k), I-A(S), I-E(d), and I-E(k). In this chapter we give a step-by-step guide to building these models and assessing their predictive reliability; the resulting models represent an advance on existing methods. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online at the URL http://www.jenner.ac.uk/MHCPred.
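
As a rough illustration of the additive idea mentioned in the abstract, the following minimal Python sketch treats a peptide's predicted affinity as a constant plus one contribution per (position, residue) pair. It is not the authors' implementation: the function names, the toy peptide, and every coefficient value are hypothetical, the PLS fitting step that would estimate the contributions from measured IC50 data is omitted, and any interaction terms between residues are left out. The only fixed fact used is the standard conversion pIC50 = -log10(IC50 in mol/L).

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def pic50_from_nM(ic50_nM):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in molar)."""
    return 9.0 - math.log10(ic50_nM)


def encode(peptide):
    """Binary indicator vector with one slot per (position, amino acid) pair.

    This is the kind of descriptor matrix row a regression method such as
    PLS could be fitted on; the fitting itself is not shown here.
    """
    vec = [0] * (len(peptide) * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1
    return vec


def predict_pic50(peptide, const, contributions):
    """Additive score: constant plus one term per (position, residue).

    Pairs missing from `contributions` are treated as contributing zero.
    """
    return const + sum(
        contributions.get((pos, aa), 0.0) for pos, aa in enumerate(peptide)
    )


# Hypothetical coefficients and peptide, for illustration only.
toy_const = 6.0
toy_contrib = {(1, "L"): 0.5, (8, "V"): 0.4, (0, "P"): -0.3}
toy_peptide = "ALAKAAAAV"

score = predict_pic50(toy_peptide, toy_const, toy_contrib)
features = encode(toy_peptide)
print(f"predicted pIC50: {score:.2f}, active descriptors: {sum(features)}")
```

In a real workflow the contribution table would come from a regression fitted to experimentally measured affinities rather than being written by hand, and predictions would only be meaningful for the allele and peptide length the model was trained on.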
