Research and Development Group, Protein Metrics Inc, Cupertino, California, USA.
Mol Cell Proteomics. 2021;20:100011. doi: 10.1074/mcp.RA120.002260. Epub 2020 Dec 8.
Glycopeptides in peptide or digested protein samples pose a number of analytical and bioinformatics challenges beyond those posed by unmodified peptides or peptides with smaller posttranslational modifications. Exact structural elucidation of glycans is generally beyond the capability of a single mass spectrometry experiment, so a reasonable level of identification for tandem mass spectrometry, adopted by several glycopeptide software tools, is that of peptide sequence and glycan composition, meaning the number of monosaccharides of each distinct mass, e.g., HexNAc(2)Hex(5) rather than Man5. Even at this level, however, glycopeptide analysis poses challenges: finding glycopeptide spectra when they are a tiny fraction of the total spectra; assigning spectra with unanticipated glycans that are not in the initial glycan database; and finding, scoring, and labeling diagnostic peaks in tandem mass spectra. Here, we discuss recent improvements to Byonic, a glycoproteomics search program, that address these three issues. Byonic now supports filtering spectra by m/z peaks, so that the user can limit attention to spectra with diagnostic peaks, e.g., at least two out of three of 204.087 for HexNAc, 274.092 for NeuAc (with water loss), and 366.139 for HexNAc-Hex, all within a set mass tolerance, e.g., ± 0.01 Da. Also new is a glycan "wildcard" search, which allows an unspecified mass within a user-set mass range to be applied to N- or O-linked glycans and enables assignment of spectra with unanticipated glycans. Finally, the next release of Byonic supports user-specified peak annotations from user-defined posttranslational modifications. We demonstrate the utility of these new software features by finding previously unrecognized glycopeptides in publicly available data, including glycosylated neuropeptides from rat brain.
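The diagnostic-peak filter described above (at least two of three oxonium-ion peaks, each within ± 0.01 Da) can be illustrated with a minimal Python sketch. This is not Byonic's implementation or interface: the function names `has_peak` and `passes_filter`, the representation of a spectrum as a plain list of m/z values, and the absence of any intensity threshold are all assumptions made for illustration. Only the three m/z values, the two-of-three rule, and the tolerance come from the abstract.

```python
from bisect import bisect_left

# Diagnostic m/z values quoted in the abstract.
DIAGNOSTIC_PEAKS = {
    "HexNAc": 204.087,
    "NeuAc (with water loss)": 274.092,
    "HexNAc-Hex": 366.139,
}


def has_peak(sorted_mz, target, tol=0.01):
    """Return True if any m/z in the sorted list lies within target +/- tol.

    Hypothetical helper: binary search for the first value >= target - tol,
    then check that it does not exceed target + tol.
    """
    i = bisect_left(sorted_mz, target - tol)
    return i < len(sorted_mz) and sorted_mz[i] <= target + tol


def passes_filter(mz_values, min_matches=2, tol=0.01):
    """Check whether a spectrum contains enough diagnostic peaks.

    Returns (passed, matched_names). Peak intensities are ignored here,
    which is a simplifying assumption of this sketch.
    """
    sorted_mz = sorted(mz_values)
    matched = [
        name
        for name, mz in DIAGNOSTIC_PEAKS.items()
        if has_peak(sorted_mz, mz, tol)
    ]
    return len(matched) >= min_matches, matched


if __name__ == "__main__":
    # Made-up peak list for demonstration only.
    spectrum = [147.113, 204.0866, 366.1401, 512.3]
    print(passes_filter(spectrum))
```

A spectrum with peaks near 204.087 and 366.139 passes the two-of-three rule, while one with only the HexNAc peak does not. A real filter would likely also require a minimum intensity for each peak so that noise does not trigger a match.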