Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland.
University of Geneva, Geneva, 1211, Switzerland.
Anal Chem. 2017 Oct 17;89(20):10932-10940. doi: 10.1021/acs.analchem.7b02754. Epub 2017 Sep 25.
Tandem mass spectrometry, when combined with liquid chromatography and applied to complex mixtures, produces large amounts of raw data, which need to be analyzed to identify molecular structures. This technique is widely used, particularly in glycomics. Due to a lack of high-throughput glycan sequencing software, glycan spectra are predominantly sequenced manually. A challenge in writing glycan-sequencing software is that there is no direct template that can be used to infer the structures detectable in an organism. To help alleviate this bottleneck, we present Glycoforest 1.0, a partial de novo algorithm for sequencing glycan structures based on MS/MS spectra. Glycoforest was tested on two data sets (human gastric and salmon mucosa O-linked glycomes) for which MS/MS spectra were annotated manually. Glycoforest generated the human-validated structure for 92% of test cases. The correct structure was the best-scoring match for 70% of test cases and among the top 3 matches for 83%. In addition, the Glycoforest algorithm detected glycan structures from MS/MS spectra that lacked a manual annotation. In total, 1532 previously unannotated MS/MS spectra were annotated by Glycoforest. A portion containing 521 of these spectra was checked manually, confirming that Glycoforest annotated an additional 50 MS/MS spectra overlooked during manual annotation.