Rodriguez Jesse, Gupta Nitin, Smith Richard D, Pevzner Pavel A
Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA.
J Proteome Res. 2008 Jan;7(1):300-5. doi: 10.1021/pr0705035. Epub 2007 Dec 8.
Trypsin, the enzyme most commonly used for protein digestion in mass spectrometry, has high substrate specificity. Many peptide identification algorithms incorporate its specificity rules as filtering criteria. The generally accepted "Keil rule" states that trypsin cleaves after arginine or lysine, but not before proline. Since this rule was derived two decades ago from a small number of experimentally confirmed cleavages, we decided to re-examine it using 14.5 million tandem spectra (a two-orders-of-magnitude increase in the number of observed tryptic cleavages). Our analysis revealed a surprisingly large number of cleavages before proline. We examine several hypotheses to explain these cleavages and argue that the trypsin specificity rules used in peptide identification algorithms should be modified to "legitimatize" cleavages before proline. Our approach can be applied to any protease, and we further argue that specificity rules for other enzymes should also be re-evaluated using statistical evidence derived from large MS/MS data sets.
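As an illustration of the rule discussed in the abstract, the following is a minimal sketch of an in silico tryptic digest with and without the proline exception. The function names and the `allow_before_proline` flag are hypothetical and do not come from the paper or from any particular search engine; missed cleavages and other enzyme-specific effects are not modeled.

```python
def tryptic_cleavage_sites(protein, allow_before_proline=False):
    """Return 1-based positions after which trypsin is predicted to cleave.

    With allow_before_proline=False this applies the Keil rule:
    cleave after K or R unless the next residue is P.
    With allow_before_proline=True the proline exception is dropped.
    """
    sites = []
    # The last residue is skipped: there is no bond after it to cleave.
    for i, residue in enumerate(protein[:-1]):
        if residue not in "KR":
            continue
        if protein[i + 1] == "P" and not allow_before_proline:
            continue  # Keil rule: no cleavage before proline
        sites.append(i + 1)
    return sites


def digest(protein, allow_before_proline=False):
    """Split a protein sequence into fully cleaved tryptic peptides."""
    sites = tryptic_cleavage_sites(protein, allow_before_proline)
    bounds = [0] + sites + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    seq = "AAKPGGRTTK"  # toy sequence, assumed for illustration only
    print(digest(seq))                             # ['AAKPGGR', 'TTK']
    print(digest(seq, allow_before_proline=True))  # ['AAK', 'PGGR', 'TTK']
```

Under the Keil rule the K-P bond in the toy sequence is left intact, so peptides such as `AAK` and `PGGR` would never be considered as candidates; relaxing the rule adds them to the search space.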