Université Paris Cité, Institut Pasteur, CNRS UAR 2024, Mass Spectrometry for Biology Unit, Paris 75015, France.
Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen 72076, Germany.
J Proteome Res. 2023 Jul 7;22(7):2199-2217. doi: 10.1021/acs.jproteome.2c00673. Epub 2023 May 26.
Generating top-down tandem mass spectra (MS/MS) from complex mixtures of proteoforms benefits from improvements in fractionation, separation, fragmentation, and mass analysis. The algorithms to match MS/MS to sequences have undergone a parallel evolution, with both spectral alignment and match-counting approaches producing high-quality proteoform-spectrum matches (PrSMs). This study assesses state-of-the-art algorithms for top-down identification (ProSight PD, TopPIC, MSPathFinderT, and pTop) in their yield of PrSMs while controlling false discovery rate. We evaluated deconvolution engines (ThermoFisher Xtract, Bruker AutoMSn, Matrix Science Mascot Distiller, TopFD, and FLASHDeconv) in both ThermoFisher Orbitrap-class and Bruker maXis Q-TOF data (PXD033208) to produce consistent precursor charges and mass determinations. Finally, we sought post-translational modifications (PTMs) in proteoforms from bovine milk (PXD031744) and human ovarian tissue. Contemporary identification workflows produce excellent PrSM yields, although approximately half of all identified proteoforms from these four pipelines were specific to only one workflow. Deconvolution algorithms disagree on precursor masses and charges, contributing to identification variability. Detection of PTMs is inconsistent among algorithms. In bovine milk, 18% of PrSMs produced by pTop and TopMG were singly phosphorylated, but this percentage fell to 1% for one algorithm. Applying multiple search engines produces more comprehensive assessments of experiments. Top-down algorithms would benefit from greater interoperability.
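The workflow-specificity figure reported above (roughly half of identified proteoforms seen by only one of the four pipelines) can be illustrated with a minimal sketch. It assumes that proteoform identifiers have already been harmonized across search engines, which in practice requires matching on sequence, modifications, and mass within a tolerance. The function name and the toy identifiers are hypothetical and are not taken from the study.

```python
from collections import Counter


def workflow_specific_fraction(ids_by_workflow):
    """Fraction of distinct proteoform IDs reported by exactly one workflow.

    ids_by_workflow maps a workflow name to an iterable of proteoform IDs.
    IDs are assumed to be comparable across workflows (a strong assumption).
    """
    counts = Counter()
    for ids in ids_by_workflow.values():
        # set() ensures each workflow counts a proteoform at most once
        counts.update(set(ids))
    if not counts:
        return 0.0
    specific = sum(1 for n in counts.values() if n == 1)
    return specific / len(counts)


# Hypothetical toy identifiers, not data from the study.
example = {
    "ProSight PD": {"P1", "P2", "P3"},
    "TopPIC": {"P2", "P3", "P4"},
    "MSPathFinderT": {"P3", "P5"},
    "pTop": {"P3", "P6"},
}
fraction = workflow_specific_fraction(example)
```

In this toy input, four of the six distinct identifiers appear in a single workflow. How identifiers are harmonized between engines will strongly affect any such figure on real data.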