Department of Chemistry and Biology, Faculty of Science, Ryerson University, Toronto, Canada; Ryerson Analytical Biochemistry Laboratory (RABL), Canada.
Anal Biochem. 2020 Jun 15;599:113680. doi: 10.1016/j.ab.2020.113680. Epub 2020 Mar 16.
The Empirical Statistical Model (ESM) for decoy library searching fused the expected amino acid sequences of 18 non-human protein standards to a human decoy library. The ESM assumed a priori that the standards were pure, such that only the 18 nominal proteins were true positives and all other proteins were false positives; that there was no overlap between the peptides of non-human and human proteins; and that the score distribution of individual peptides would resolve true positive results from false positive results or noise. Random and independent sampling by LC-ESI-MS/MS indicated that these fundamental assumptions were not in good agreement with the actual purity of the commercial test standards, and so the method showed a 99.7% false negative rate. The ESM for decoy library searching apparently showed poor agreement with SDS-PAGE with silver staining, the goodness of fit of MS/MS spectra by X!TANDEM, FDR correction by the method of Benjamini and Hochberg, and comparison to the observation frequency of null random MS/MS spectra, all of which confirmed that the standards contain hundreds of proteins with a low FDR of primary structural identification. The protein observation frequency increased with abundance, and the log precursor intensity distributions were Gaussian and nearly ideal for relative quantification.
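The abstract names FDR correction by Benjamini and Hochberg as one of the independent checks. As a rough illustration of that general procedure only, the sketch below computes Benjamini-Hochberg adjusted p-values in plain Python. It is not the authors' pipeline: the function name `benjamini_hochberg`, the `alpha` default, and the idea of feeding it one p-value per protein identification are assumptions made for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted_p_values, rejected_flags) in the input order.

    Hypothetical helper for illustration; inputs are assumed to be
    one p-value per identification, each in the range [0, 1].
    """
    m = len(p_values)
    if m == 0:
        return [], []

    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min

    # An identification is accepted at FDR level alpha when its
    # adjusted p-value does not exceed alpha.
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


if __name__ == "__main__":
    # Made-up p-values, not data from the study.
    example = [0.01, 0.04, 0.03, 0.005]
    q_values, accepted = benjamini_hochberg(example, alpha=0.05)
    for p, q, ok in zip(example, q_values, accepted):
        print(f"p={p:.3f}  q={q:.3f}  accepted={ok}")
```

Taking the running minimum from the largest rank downward keeps the adjusted values monotone in the original p-values, which makes thresholding at `alpha` equivalent to the usual step-up rule.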