Laboratory on AI for Computational Biology, Faculty of Computer Science, HSE University, Moscow, Russian Federation.
Proteomics. 2024 Mar;24(5):e2300145. doi: 10.1002/pmic.202300145. Epub 2023 Sep 19.
Exact p-value (XPV)-based methods for dot product-like score functions, such as the XCorr score implemented in Tide, SEQUEST, and Comet or the shared peak count-based scoring in MSGF+ and ASPV, provide fairly good calibration for peptide-spectrum-match (PSM) scoring in database search-based identification of MS/MS spectra. Unfortunately, standard XPV methods cannot, in practice, handle high-resolution fragmentation data produced by state-of-the-art mass spectrometers, because smaller bins increase the number of fragment matches that are assigned to incorrect bins and therefore scored improperly. In this article, we present an extension of the XPV method, called the high-resolution exact p-value (HR-XPV) method, which can be used to calibrate PSM scores of high-resolution MS/MS spectra obtained with dot product-like scoring such as XCorr. HR-XPV carries remainder masses throughout the fragmentation, which greatly increases the number of fragments that are assigned to the correct bin and thus takes advantage of high-resolution data. Using four mass spectrometry data sets, our experimental results demonstrate that HR-XPV produces well-calibrated scores, which in turn results in more trusted spectrum annotations at any false discovery rate level.
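The binning problem described in the abstract can be illustrated with a toy sketch. It is not the HR-XPV algorithm and not the code of Tide or any other search engine. The function names, the two bin widths, and the choice to store masses as integers in units of 10^-5 Da (so the arithmetic is exact) are all assumptions made for illustration. The residue masses are standard monoisotopic values. The sketch compares three ways of assigning peptide prefix masses to bins: from the exact cumulative mass, by discretizing each residue separately and adding the bin indices, and by adding residues one at a time while carrying the sub-bin remainder forward.

```python
# Monoisotopic residue masses in units of 1e-5 Da (integers keep arithmetic exact).
RESIDUE_MASS = {
    "G": 5702146, "A": 7103711, "S": 8703203, "P": 9705276,
    "V": 9906841, "T": 10104768, "L": 11308406,
}

LOW_RES_BIN = 100000   # 1 Da, an illustrative low-resolution bin width
HIGH_RES_BIN = 2000    # 0.02 Da, an illustrative high-resolution bin width


def exact_prefix_bins(peptide, bin_width):
    """Bin index of every prefix, computed from the exact cumulative mass."""
    bins, total = [], 0
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        bins.append(total // bin_width)
    return bins


def summed_residue_bins(peptide, bin_width):
    """Discretize each residue on its own, then add the bin indices.

    The sub-bin part of every residue mass is discarded, so the error grows
    with prefix length and matters more as the bins get narrower.
    """
    bins, index = [], 0
    for aa in peptide:
        index += RESIDUE_MASS[aa] // bin_width
        bins.append(index)
    return bins


def remainder_carrying_bins(peptide, bin_width):
    """Add residues one at a time while carrying the sub-bin remainder."""
    bins, index, remainder = [], 0, 0
    for aa in peptide:
        whole, remainder = divmod(RESIDUE_MASS[aa] + remainder, bin_width)
        index += whole
        bins.append(index)
    return bins


def count_misbinned(peptide, bin_width):
    """Number of prefixes that per-residue discretization puts in the wrong bin."""
    exact = exact_prefix_bins(peptide, bin_width)
    naive = summed_residue_bins(peptide, bin_width)
    return sum(e != n for e, n in zip(exact, naive))


if __name__ == "__main__":
    peptide = "GAS"
    for width in (LOW_RES_BIN, HIGH_RES_BIN):
        print(width, count_misbinned(peptide, width),
              remainder_carrying_bins(peptide, width) == exact_prefix_bins(peptide, width))
```

For the tripeptide GAS, per-residue discretization agrees with the exact bins at the 1 Da width but places the third prefix one bin too low at 0.02 Da, whereas the remainder-carrying version matches the exact bins at both widths. How HR-XPV actually incorporates remainder masses into its dynamic programming is described in the article itself; this sketch only shows why discarding them hurts more as bins shrink.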