Hogan Reuben A, Pepi Lauren E, Riley Nicholas M, Chalkley Robert J
University of California, San Francisco, San Francisco, CA, USA.
Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA.
Anal Bioanal Chem. 2025 Apr;417(10):1985-2001. doi: 10.1007/s00216-025-05780-9. Epub 2025 Mar 18.
Glycoproteomics is a rapidly developing field, and its data analysis has been stimulated by several technological innovations. As a result, there are many software tools from which to choose, and each comes with unique features that can be difficult to compare. This work presents a head-to-head comparison of five modern analytical software tools: Byonic, Protein Prospector, MSFraggerGlyco, pGlyco3, and GlycoDecipher. To enable a meaningful comparison, parameter variables were minimized. One potential confounding variable is the glycan database that informs glycoproteomic searches. We performed glycomic profiling of the samples and used the output to construct matched glycan databases for each tool. Up to 17,000 glycopeptide spectra were identified across three replicates of wild-type SH-SY5Y cells. The tools overlapped in the glycoproteins, glycosite locations, and glycans they identified, but there was no clear winner. Using several comparative criteria was critical to extracting the most information from this study, and this practice should be adopted more broadly when assessing software. A single criterion, such as the number of glycopeptide spectra found, is not sufficient. We present evidence suggesting that Byonic reports many spurious results at the glycoprotein and glycosite levels. Overall, our results indicate that glycoproteomic searches should use more than one software tool, excluding the current version of Byonic, to generate confidence by consensus. It may be useful to consider both software with peptide-first approaches and software with glycan-first approaches.
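The "confidence by consensus" recommendation in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the function name `consensus_ids`, the `min_tools` threshold, the `(protein, site, glycan)` tuple representation, and all data values are hypothetical and chosen only for illustration. It assumes each tool's results have already been exported and normalized to comparable identifiers.

```python
from collections import Counter


def consensus_ids(results_by_tool, min_tools=2, exclude=()):
    """Return identifications reported by at least `min_tools` tools.

    results_by_tool: dict mapping a tool name to an iterable of hashable
    identifications, here hypothetical (protein, site, glycan) tuples.
    exclude: tool names whose results are ignored entirely.
    """
    counts = Counter()
    for tool, ids in results_by_tool.items():
        if tool in exclude:
            continue
        # set() ensures each tool contributes at most one vote per identification
        counts.update(set(ids))
    return {ident for ident, n in counts.items() if n >= min_tools}


# Toy data, invented for illustration only (not taken from the study).
results = {
    "Byonic": {("P1", 10, "HexNAc2Hex5"), ("P9", 3, "HexNAc2Hex9")},
    "pGlyco3": {("P1", 10, "HexNAc2Hex5"), ("P2", 55, "HexNAc4Hex5")},
    "MSFraggerGlyco": {("P1", 10, "HexNAc2Hex5"), ("P2", 55, "HexNAc4Hex5")},
    "Protein Prospector": {("P2", 55, "HexNAc4Hex5"), ("P3", 7, "HexNAc2Hex6")},
}

# Keep identifications supported by two or more tools, leaving Byonic out.
consensus = consensus_ids(results, min_tools=2, exclude={"Byonic"})
```

Raising `min_tools` trades sensitivity for stricter agreement. In practice, identifications from different tools would first need to be mapped to a common protein, site, and glycan nomenclature before such a comparison is meaningful.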