Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States.
Biophysics Graduate Program, University of Wisconsin, 413 Bock Laboratories, 1525 Linden Drive, Madison, Wisconsin 53706, United States.
J Proteome Res. 2019 Oct 4;18(10):3671-3680. doi: 10.1021/acs.jproteome.9b00339. Epub 2019 Sep 18.
Complex human biomolecular processes are made possible by the diversity of human proteoforms. Constructing proteoform families, groups of proteoforms derived from the same gene, is one way to represent this diversity. Comprehensive, high-confidence identification of human proteoforms remains a central challenge in mass spectrometry-based proteomics. We have previously reported a strategy for proteoform identification using intact-mass measurements, and we have since improved that strategy by mass calibration based on search results, the use of a global post-translational modification discovery database, and the integration of top-down proteomics results with intact-mass analysis. In the present study, we combine these strategies for enhanced proteoform identification in total cell lysate from the Jurkat human T lymphocyte cell line. We collected, processed, and integrated three types of proteomics data (NeuCode-labeled intact-mass, label-free top-down, and multi-protease bottom-up) to maximize the number of confident proteoform identifications. The integrated analysis revealed 5950 unique experimentally observed proteoforms, which were assembled into 848 proteoform families. Twenty percent of the observed proteoforms were confidently identified at a 3.9% false discovery rate, representing 1207 unique proteoforms derived from 484 genes.
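As a rough illustration of the proteoform family definition above (proteoforms grouped by their gene of origin), the following minimal sketch groups a few records by gene and checks the reported identification fraction. The record labels, gene names, and function name are hypothetical placeholders, not data or software from the study, and the sketch assumes for simplicity that each proteoform's gene is already known.

```python
from collections import defaultdict

# Hypothetical example records: (proteoform label, gene of origin).
# These are illustrative placeholders, not results from the study.
proteoforms = [
    ("H4_unmodified", "HIST1H4A"),
    ("H4_acetyl", "HIST1H4A"),
    ("H4_acetyl_methyl", "HIST1H4A"),
    ("ACTB_acetyl", "ACTB"),
]


def build_families(records):
    """Group proteoform labels by the gene they derive from."""
    families = defaultdict(list)
    for label, gene in records:
        families[gene].append(label)
    return dict(families)


families = build_families(proteoforms)

# Fraction of observed proteoforms that were confidently identified,
# using the counts reported in the abstract (1207 of 5950).
reported_fraction = 1207 / 5950

for gene, members in families.items():
    print(gene, members)
print(f"identified fraction: {reported_fraction:.1%}")
```

The counts in the abstract give 1207 / 5950, or roughly 20%, consistent with the "twenty percent" stated in the text.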