Hart-Smith Gene, Yagoub Daniel, Tay Aidan P, Pickford Russell, Wilkins Marc R
From the ‡New South Wales Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, and
Mol Cell Proteomics. 2016 Mar;15(3):989-1006. doi: 10.1074/mcp.M115.055384. Epub 2015 Dec 23.
All large scale LC-MS/MS post-translational methylation site discovery experiments require methylpeptide spectrum matches (methyl-PSMs) to be identified at acceptably low false discovery rates (FDRs). To meet estimated methyl-PSM FDRs, methyl-PSM filtering criteria are often determined using the target-decoy approach. The efficacy of this methyl-PSM filtering approach has, however, yet to be thoroughly evaluated. Here, we conduct a systematic analysis of methyl-PSM FDRs across a range of sample preparation workflows (each differing in their exposure to the alcohols methanol and isopropyl alcohol) and mass spectrometric instrument platforms (each employing a different mode of MS/MS dissociation). Through ¹³CD₃-methionine labeling (heavy-methyl SILAC) of Saccharomyces cerevisiae cells and in-depth manual data inspection, accurate lists of true positive methyl-PSMs were determined, allowing methyl-PSM FDRs to be compared with target-decoy approach-derived methyl-PSM FDR estimates. These results show that global FDR estimates produce extremely unreliable methyl-PSM filtering criteria; we demonstrate that this is an unavoidable consequence of the high number of amino acid combinations capable of producing peptide sequences that are isobaric to methylated peptides of a different sequence. Separate methyl-PSM FDR estimates were also found to be unreliable due to prevalent sources of false positive methyl-PSMs that produce high peptide identity score distributions. Incorrect methylation site localizations, peptides containing cysteinyl-S-β-propionamide, and methylated glutamic or aspartic acid residues can partially, but not wholly, account for these false positive methyl-PSMs. Together, these results indicate that the target-decoy approach is an unreliable means of estimating methyl-PSM FDRs and methyl-PSM filtering criteria. We suggest that orthogonal methylpeptide validation (e.g. heavy-methyl SILAC or its offshoots) should be considered a prerequisite for obtaining high confidence methyl-PSMs in large scale LC-MS/MS methylation site discovery experiments and make recommendations on how to reduce methyl-PSM FDRs in samples not amenable to heavy isotope labeling. Data are available via ProteomeXchange with the data identifier PXD002857.
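The isobaric-sequence problem named in the abstract can be illustrated in its simplest form with single-residue substitutions. One methylation adds 14.01565 Da (a net gain of CH2), and several pairs of standard amino acids differ by exactly that mass, so an unmodified peptide carrying the heavier residue has the same mass as a methylated peptide carrying the lighter one. The sketch below is not the authors' analysis, which concerns amino acid combinations more generally. It only enumerates single substitutions using standard monoisotopic residue masses, and the function name `isobaric_substitutions` and the 0.001 Da tolerance are assumptions chosen for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

# Mass added by a single methylation (net addition of CH2).
METHYL_SHIFT = 14.01565


def isobaric_substitutions(tolerance=0.001):
    """Return sorted (lighter, heavier) residue pairs whose mass difference
    equals one methylation within `tolerance` Da (illustrative helper)."""
    pairs = []
    for lighter, mass_lighter in RESIDUE_MASS.items():
        for heavier, mass_heavier in RESIDUE_MASS.items():
            delta = mass_heavier - mass_lighter
            if abs(delta - METHYL_SHIFT) <= tolerance:
                pairs.append((lighter, heavier))
    return sorted(pairs)


if __name__ == "__main__":
    for lighter, heavier in isobaric_substitutions():
        print(f"methyl-{lighter} is isobaric to unmodified {heavier}")
```

Each pair has an identical elemental composition once the methyl group is counted, so mass accuracy alone cannot separate the two candidates. Only fragment ions or orthogonal evidence such as heavy-methyl SILAC can do so.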