Dong Qian, Yan Xinjian, Kilpatrick Lisa E, Liang Yuxue, Mirokhin Yuri A, Roth Jeri S, Rudnick Paul A, Stein Stephen E
From the Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8362, Gaithersburg, Maryland 20899, United States.
Mol Cell Proteomics. 2014 Sep;13(9):2435-49. doi: 10.1074/mcp.O113.037135. Epub 2014 Jun 2.
This work presents a method for creating a mass spectral library containing tandem spectra of identifiable peptide ions in the tryptic digestion of a single protein. Human serum albumin (HSA) was selected for this purpose owing to its ubiquity, high level of characterization, and availability of digest data. The underlying experimental data consisted of ∼3000 one-dimensional LC-ESI-MS/MS runs with ion-trap fragmentation. To generate a wide range of peptides, studies covered a broad set of instrument and digestion conditions using multiple sources of HSA and trypsin. Computer methods were developed to enable the reliable identification and reference spectrum extraction of all peptide ions identifiable by current sequence search methods. This process made use of both MS2 (tandem) spectra and MS1 (electrospray) data. Identified spectra were generated for 2918 different peptide ions, using a variety of manually validated filters to ensure spectrum quality and identification reliability. The resulting library was composed of 10% conventional tryptic and 29% semitryptic peptide ions, along with 42% tryptic peptide ions with known or unknown modifications, which included both analytical artifacts and post-translational modifications (PTMs) present in the original HSA. The remaining 19% contained unexpected missed cleavages or were under- or over-alkylated. The methods described can be extended to create equivalent spectral libraries for any target protein. Such libraries have a number of applications in addition to their known advantages of speed and sensitivity, including the ready re-identification of known PTMs, rejection of artifact spectra, and a means of assessing sample and digestion quality.