Abelin Jennifer G, Keskin Derin B, Sarkizova Siranush, Hartigan Christina R, Zhang Wandi, Sidney John, Stevens Jonathan, Lane William, Zhang Guang Lan, Eisenhaure Thomas M, Clauser Karl R, Hacohen Nir, Rooney Michael S, Carr Steven A, Wu Catherine J
Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA; Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA; Harvard Medical School, Boston, MA, 02115, USA; Department of Computer Science, Metropolitan College, Boston University, Boston, MA, 02215, USA.
Immunity. 2017 Feb 21;46(2):315-326. doi: 10.1016/j.immuni.2017.02.007.
Identification of human leukocyte antigen (HLA)-bound peptides by liquid chromatography-tandem mass spectrometry (LC-MS/MS) is poised to provide a deep understanding of rules underlying antigen presentation. However, a key obstacle is the ambiguity that arises from the co-expression of multiple HLA alleles. Here, we have implemented a scalable mono-allelic strategy for profiling the HLA peptidome. By using cell lines expressing a single HLA allele, optimizing immunopurifications, and developing an application-specific spectral search algorithm, we identified thousands of peptides bound to 16 different HLA class I alleles. These data enabled the discovery of subdominant binding motifs and an integrative analysis quantifying the contribution of factors critical to epitope presentation, such as protein cleavage and gene expression. We trained neural-network prediction algorithms with our large dataset (>24,000 peptides) and outperformed algorithms trained on datasets of peptides with measured affinities. We thus demonstrate a strategy for systematically learning the rules of endogenous antigen presentation.
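One way to picture how mono-allelic data exposes a binding motif: because every peptide comes from cells expressing a single HLA allele, same-length peptides can be tallied position by position without first deconvoluting alleles. The sketch below is a minimal illustration under that assumption, not the authors' pipeline; the function name `position_frequencies` and the four 9-mers are hypothetical and invented for the example, not taken from the study's data.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(peptides):
    """Return one dict per position mapping amino acid -> observed frequency.

    Assumes all peptides have the same length and were eluted from cells
    expressing a single HLA allele (hypothetical helper, for illustration).
    """
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")

    matrix = []
    for pos in range(length):
        # Count which residue each peptide carries at this position.
        counts = Counter(p[pos] for p in peptides)
        matrix.append({aa: counts.get(aa, 0) / len(peptides) for aa in AMINO_ACIDS})
    return matrix


# Made-up 9-mers, chosen only so that positions 2 and 9 show a clear preference.
example_peptides = ["ALAKAAAAV", "GLDEFGHIV", "KLMNPQRSL", "TMWYACDEV"]
motif = position_frequencies(example_peptides)

# Report the most frequent residue at positions 2 and 9 (1-based).
for pos in (2, 9):
    column = motif[pos - 1]
    top = max(column, key=column.get)
    print(f"position {pos}: {top} ({column[top]:.2f})")
```

With thousands of real peptides per allele, the same table is the kind of input from which dominant and subdominant motifs could be read off; a trained predictor such as the neural networks mentioned above would need a richer encoding than raw frequencies.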