Department of Chemistry, University of Georgia, Athens, GA, 30606, USA.
J Am Soc Mass Spectrom. 2018 Sep;29(9):1802-1811. doi: 10.1007/s13361-018-1969-z. Epub 2018 May 22.
The biological interactions between glycosaminoglycans (GAGs) and other biomolecules are heavily influenced by structural features of the glycan. The structure of GAGs can be assigned using tandem mass spectrometry (MS), but to date, analysis of these data has required manual interpretation, a slow process that is a bottleneck to the broader use of this approach for biologically relevant problems. Automated interpretation remains a challenge because GAG biosynthesis is not template-driven, so structures cannot be predicted from genomic data as they are for proteins. The resulting lack of a structure database means the mass spectral data must be interpreted de novo. We propose a model for rapid, high-throughput GAG analysis in which candidate structures are scored for the likelihood that they would produce the features observed in the mass spectrum. To make this approach tractable, a genetic algorithm is used to greatly reduce the search space of isomeric structures that are considered. The time required for analysis is significantly reduced compared with an approach in which every possible isomer is considered and scored. The model is implemented as a software package in the MATLAB environment. The approach was tested on tandem mass spectrometry data for long-chain, moderately sulfated chondroitin sulfate oligomers derived from the proteoglycan bikunin, data that had previously been interpreted manually. Our approach examines glycosidic fragments to localize SO3 modifications to specific residues and yields the same structures reported in the literature, only much more quickly.
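The idea of scoring candidate isomers against glycosidic fragments and letting a genetic algorithm choose which candidates to score can be illustrated with a toy sketch. This is not the authors' MATLAB package: it is written in Python, and every name in it (`prefix_counts`, `fitness`, `genetic_search`, the `observed` dictionary) is hypothetical. It assumes a chain is a tuple of 0/1 flags (one per residue, at most one sulfate each), that the total number of sulfates is known from the precursor mass, and that each observed non-reducing-end glycosidic fragment reports how many sulfates it carries. The score is a simple count of matched fragments rather than the likelihood model described in the abstract.

```python
import random


def prefix_counts(candidate):
    """Sulfate count carried by each non-reducing-end fragment of length 1..n-1."""
    counts, running = [], 0
    for has_sulfate in candidate[:-1]:
        running += has_sulfate
        counts.append(running)
    return counts


def fitness(candidate, observed):
    """Number of observed fragments whose sulfate count the candidate reproduces.

    `observed` maps fragment length (in residues) -> sulfate count (assumed input).
    """
    predicted = prefix_counts(candidate)
    return sum(predicted[length - 1] == count for length, count in observed.items())


def random_candidate(n_residues, n_sulfates, rng):
    sites = set(rng.sample(range(n_residues), n_sulfates))
    return tuple(int(i in sites) for i in range(n_residues))


def crossover(a, b, n_sulfates, rng):
    """Child draws its sulfated sites from the union of both parents' sites."""
    pool = sorted({i for i, s in enumerate(a) if s} | {i for i, s in enumerate(b) if s})
    sites = set(rng.sample(pool, n_sulfates))
    return tuple(int(i in sites) for i in range(len(a)))


def mutate(candidate, rng):
    """Move one sulfate to an unsulfated residue, preserving the composition."""
    occupied = [i for i, s in enumerate(candidate) if s]
    empty = [i for i, s in enumerate(candidate) if not s]
    if not occupied or not empty:
        return candidate
    child = list(candidate)
    child[rng.choice(occupied)] = 0
    child[rng.choice(empty)] = 1
    return tuple(child)


def genetic_search(n_residues, n_sulfates, observed,
                   pop_size=30, generations=80, mutation_rate=0.5, seed=0):
    """Return the best-scoring sulfation pattern found (illustrative settings)."""
    rng = random.Random(seed)
    population = [random_candidate(n_residues, n_sulfates, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, observed), reverse=True)
        parents = population[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), n_sulfates, rng)
            if rng.random() < mutation_rate:
                child = mutate(child, rng)
            children.append(child)
        population = parents + children
    return max(population, key=lambda c: fitness(c, observed))


if __name__ == "__main__":
    # Made-up 8-residue chain with 3 sulfates; not data from the bikunin study.
    true_structure = (0, 1, 0, 0, 1, 0, 1, 0)
    observed = dict(enumerate(prefix_counts(true_structure), start=1))
    best = genetic_search(8, 3, observed)
    print(best, fitness(best, observed))
```

In this toy case there are only 56 possible placements of 3 sulfates on 8 residues, so exhaustive scoring would be trivial. The number of isomers grows combinatorially with chain length and sulfate count, which is where restricting the search to a small evolving population becomes worthwhile. Because crossover and mutation both preserve the sulfate count, every candidate stays consistent with the precursor composition.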