Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Lyngby, Denmark.
BMC Bioinformatics. 2009 Sep 18;10:296. doi: 10.1186/1471-2105-10-296.
The major histocompatibility complex (MHC) molecule plays a central role in controlling the adaptive immune response to infections. MHC class I molecules present peptides derived from intracellular proteins to cytotoxic T cells, whereas MHC class II molecules stimulate cellular and humoral immunity through presentation of extracellularly derived peptides to helper T cells. Identifying which peptides will bind a given MHC molecule is thus of great importance for understanding host-pathogen interactions, and considerable effort has gone into developing algorithms capable of predicting this binding event.
Here, we present a novel artificial neural network-based method, NN-align, which allows for simultaneous identification of the MHC class II binding core and binding affinity. NN-align is trained using a novel training algorithm that corrects for bias in the training data due to redundant binding core representation. Incorporating information about the residues flanking the peptide-binding core is shown to significantly improve prediction accuracy. The method is evaluated on a large-scale benchmark consisting of six independent data sets covering 14 human MHC class II alleles, and is demonstrated to outperform other state-of-the-art MHC class II prediction methods.
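The idea of identifying the binding core and the binding score together can be illustrated with a minimal sketch. It is not the NN-align implementation: the 9-residue core length, the 3-residue flank length, the names `candidate_cores` and `predict_core`, and the placeholder `toy_score` (which stands in for a trained neural network) are all assumptions made for illustration. The sketch slides a window along the peptide, scores each candidate core together with its flanking residues, and reports the highest-scoring core and its score.

```python
from typing import Callable, Iterator, Tuple

CORE_LEN = 9   # assumed core length; not stated in the abstract
FLANK_LEN = 3  # assumed number of flanking residues passed to the scorer

# A scorer takes (core, left_flank, right_flank) and returns a number.
Scorer = Callable[[str, str, str], float]


def candidate_cores(
    peptide: str, core_len: int = CORE_LEN, flank_len: int = FLANK_LEN
) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (offset, core, left_flank, right_flank) for every window."""
    for i in range(len(peptide) - core_len + 1):
        core = peptide[i:i + core_len]
        left = peptide[max(0, i - flank_len):i]
        right = peptide[i + core_len:i + core_len + flank_len]
        yield i, core, left, right


def predict_core(peptide: str, score: Scorer) -> Tuple[int, str, float]:
    """Return (offset, core, score) of the highest-scoring candidate core."""
    if len(peptide) < CORE_LEN:
        raise ValueError("peptide is shorter than the assumed core length")
    scored = [
        (score(core, left, right), offset, core)
        for offset, core, left, right in candidate_cores(peptide)
    ]
    # max() returns the first maximal item, so ties go to the leftmost core.
    best_score, best_offset, best_core = max(scored, key=lambda t: t[0])
    return best_offset, best_core, best_score


def toy_score(core: str, left: str, right: str) -> float:
    """Hypothetical placeholder scorer, NOT a trained model.

    Rewards a hydrophobic residue at the first core position and adds a
    small bonus per available flanking residue.
    """
    hydrophobic = set("AVILMFWY")
    anchor = 1.0 if core[0] in hydrophobic else 0.0
    return anchor + 0.01 * (len(left) + len(right))


if __name__ == "__main__":
    offset, core, value = predict_core("KKKFGGGGGGGGKKK", toy_score)
    print(offset, core, value)
```

In this sketch the core assignment and the peptide score come from the same maximization, so a single scoring function yields both outputs. Replacing `toy_score` with a trained model would leave the surrounding logic unchanged.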
The NN-align method is competitive with the state-of-the-art MHC class II peptide binding prediction algorithms. The method is publicly available at http://www.cbs.dtu.dk/services/NetMHCII-2.0.