A weighted q-gram method for glycan structure classification.

Affiliation

Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong.

Publication information

BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S33. doi: 10.1186/1471-2105-11-S1-S33.

Abstract

BACKGROUND

Glycobiology pertains to the study of carbohydrate sugar chains, or glycans, in a particular cell or organism. Many computational approaches have been proposed for analyzing these complex glycan structures, which are chains of monosaccharides. The monosaccharides are linked to one another by glycosidic bonds, which can take on a variety of conformations, thus forming branches and resulting in complex tree structures. The q-gram method is one of the recent methods used to understand glycan function based on the classification of their tree structures. This method assumes that, for a given q, different q-grams share no similarity with one another. In other words, if two structures have completely different components, they are treated as completely different. From a biological standpoint, however, this is not the case. In this paper, we propose a weighted q-gram method that measures the similarity among glycans by incorporating the similarity of the geometric structures, monosaccharides and glycosidic bonds among q-grams. In contrast to the traditional q-gram method, our weighted q-gram method admits similarity among q-grams for a given q. On this basis, we developed new kernels for glycan structures and applied them in SVMs to classify glycans.
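The contrast between the two kernels can be illustrated with a minimal sketch. It is not the authors' implementation: the tree representation, the function names, and the position-wise similarity are assumptions made for illustration only. In particular, the similarity below compares monosaccharide names only, as a simple stand-in for the paper's similarity, which also draws on geometric structure and glycosidic bonds.

```python
from collections import Counter
from itertools import product

# Assumed representation: a glycan is a rooted tree (monosaccharide, [children]).
# Assumed q-gram definition: a downward path of q nodes.

def paths_of_length(tree, q):
    """Return every downward path of q nodes as a tuple of monosaccharide names."""
    result = []

    def extend(node, prefix):
        name, kids = node
        prefix = prefix + (name,)
        if len(prefix) == q:
            result.append(prefix)
            return
        for kid in kids:
            extend(kid, prefix)

    def visit(node):
        extend(node, ())          # paths that start at this node
        for kid in node[1]:
            visit(kid)            # paths that start further down

    visit(tree)
    return result

def qgram_profile(tree, q):
    """Count how often each q-gram occurs in the tree."""
    return Counter(paths_of_length(tree, q))

def plain_kernel(a, b):
    """Unweighted q-gram kernel: only identical q-grams contribute."""
    return sum(count * b[g] for g, count in a.items())

def position_similarity(g, h):
    """Hypothetical q-gram similarity: fraction of positions with the same name."""
    return sum(x == y for x, y in zip(g, h)) / len(g)

def weighted_kernel(a, b, sim=position_similarity):
    """Weighted q-gram kernel: every pair of q-grams contributes by its similarity."""
    return sum(ca * cb * sim(g, h)
               for (g, ca), (h, cb) in product(a.items(), b.items()))

# Two toy glycans that share no identical 2-gram.
glycan_a = ("Man", [("GlcNAc", [("Gal", [])]), ("GlcNAc", [("Fuc", [])])])
glycan_b = ("Glc", [("Gal", [("Neu5Ac", [])])])

profile_a = qgram_profile(glycan_a, 2)
profile_b = qgram_profile(glycan_b, 2)

print(plain_kernel(profile_a, profile_b))     # 0: no q-gram in common
print(weighted_kernel(profile_a, profile_b))  # 0.5: (GlcNAc, Gal) partly matches (Glc, Gal)
```

In this toy case the unweighted kernel treats the two glycans as unrelated, while the weighted kernel registers that one pair of q-grams shares a monosaccharide in the same position. A kernel matrix built this way could then be passed to an SVM that accepts precomputed kernels.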

RESULTS

Two glycan datasets were used to compare the weighted q-gram method and the original q-gram method. The results show that the incorporation of q-gram similarity improves the classification performance for all of the important glycan classes tested.

CONCLUSION

The results in this paper indicate that similarity among q-grams obtained from geometric structure, monosaccharides and glycosidic linkage contributes to the glycan function classification. This is a big step towards the understanding of glycan function based on their complex structures.
