GlycoCT-a unifying sequence format for carbohydrates.

Author information

Herget S, Ranzinger R, Maass K, von der Lieth C-W

Affiliation

German Cancer Research Center, Molecular Structure Analysis (W160), Molecular Modeling Group, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany.

Publication information

Carbohydr Res. 2008 Aug 11;343(12):2162-71. doi: 10.1016/j.carres.2008.03.011. Epub 2008 Mar 13.

DOI:10.1016/j.carres.2008.03.011

PMID:18436199

Abstract

As part of the EUROCarbDB project (www.eurocarbdb.org) we have carefully analyzed the encoding capabilities of all existing carbohydrate sequence formats and the content of publicly available structure databases. We have found that none of the existing structural encoding schemata are capable of coping with the full complexity to be expected for experimentally derived structural carbohydrate sequence data across all taxonomic sources. This gap motivated us to define an encoding scheme for complex carbohydrates, named GlycoCT, to overcome the current limitations. This new format is based on a connection table approach, instead of a linear encoding scheme, to describe the carbohydrate sequences, with a controlled vocabulary to name monosaccharides, adopting IUPAC rules to generate a consistent, machine-readable nomenclature. The format uses a block concept to describe frequently occurring special features of carbohydrate sequences like repeating units. It exists in two variants, a condensed form and a more verbose XML syntax. Sorting rules ensure the uniqueness of the condensed form, thus making it suitable as a direct primary key for database applications, which rely on unique identifiers. GlycoCT encompasses the capabilities of the heterogeneous landscape of digital encoding schemata in glycomics and is thus a step forward on the way to a unified and broadly accepted sequence format in glycobioinformatics.
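
To illustrate why a connection table combined with sorting rules can serve as a unique key, here is a minimal Python sketch. It is not the GlycoCT specification: the `Residue` class, the `canonical_key` and `serialize` functions, and the simple lexicographic ordering of branches are all assumptions made for this example, and the output is only loosely modeled on a residue block followed by a linkage block. The actual sorting rules and vocabulary are defined in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Residue:
    # Illustrative monosaccharide name; not validated against any vocabulary.
    name: str
    # Each child is a (linkage, Residue) pair, e.g. ("4+1", Residue(...)).
    children: list = field(default_factory=list)


def canonical_key(res):
    """Build a recursive string key used to order sibling branches."""
    parts = sorted(f"({link}){canonical_key(child)}" for link, child in res.children)
    return res.name + "".join(parts)


def serialize(root):
    """Emit a residue block and a linkage block in a deterministic order.

    Hypothetical simplification: siblings are sorted lexicographically by
    linkage and subtree key, which is NOT the rule set of the real format.
    """
    res_lines, lin_lines = [], []
    counter = 0

    def visit(res):
        nonlocal counter
        counter += 1
        idx = counter
        res_lines.append(f"{idx}b:{res.name}")
        ordered = sorted(
            res.children,
            key=lambda lc: f"({lc[0]}){canonical_key(lc[1])}",
        )
        for link, child in ordered:
            child_idx = counter + 1  # index the child receives when visited next
            lin_lines.append(f"{len(lin_lines) + 1}:{idx}o({link}){child_idx}d")
            visit(child)

    visit(root)
    return "\n".join(["RES", *res_lines, "LIN", *lin_lines])


# Two descriptions of the same branched structure, entered in different orders.
first = Residue("b-dglc-HEX-1:5", [
    ("4+1", Residue("b-dgal-HEX-1:5")),
    ("6+1", Residue("a-lgal-HEX-1:5|6:d")),
])
second = Residue("b-dglc-HEX-1:5", [
    ("6+1", Residue("a-lgal-HEX-1:5|6:d")),
    ("4+1", Residue("b-dgal-HEX-1:5")),
])

# Both inputs serialize to the same string, so it can act as a lookup key.
print(serialize(first) == serialize(second))
```

Because the branches are ordered before numbering, the order in which a structure was entered does not affect the resulting string. That property is what allows a condensed encoding to be used directly as a database key.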

Similar articles

Glycome-DB.org: a portal for querying across the digital world of carbohydrate sequences.

Glycobiology. 2009 Dec;19(12):1563-7. doi: 10.1093/glycob/cwp137. Epub 2009 Sep 16.

GlycomeDB - integration of open-access carbohydrate structure databases.

BMC Bioinformatics. 2008 Sep 19;9:384. doi: 10.1186/1471-2105-9-384.

A global representation of the carbohydrate structures: a tool for the analysis of glycan.

Genome Inform. 2005;16(1):214-22.

Detection of monosaccharide types from coordinates.

Genome Inform. 2007;19:3-14.

A molecular builder for carbohydrates: application to polysaccharides and complex carbohydrates.

Biopolymers. 1996 Sep;39(3):417-33. doi: 10.1002/(SICI)1097-0282(199609)39:3<417::AID-BIP13>3.0.CO;2-8.

WURCS 2.0 Update To Encapsulate Ambiguous Carbohydrate Structures.

J Chem Inf Model. 2017 Apr 24;57(4):632-637. doi: 10.1021/acs.jcim.6b00650. Epub 2017 Mar 22.

Accurate prediction for atomic-level protein design and its application in diversifying the near-optimal sequence space.

Proteins. 2009 May 15;75(3):682-705. doi: 10.1002/prot.22280.

BALLDock/SLICK: a new method for protein-carbohydrate docking.

J Chem Inf Model. 2008 Aug;48(8):1616-25. doi: 10.1021/ci800103u. Epub 2008 Jul 23.

Tools for glycomics: mapping interactions of carbohydrates in biological systems.

Chembiochem. 2004 Oct 4;5(10):1375-83. doi: 10.1002/cbic.200400106.

Cited by

GNOme, an ontology for glycan naming and subsumption.

Anal Bioanal Chem. 2025 Apr;417(10):1961-1973. doi: 10.1007/s00216-025-05757-8. Epub 2025 Feb 8.

GP-Plotter: Flexible Spectral Visualization for Proteomics Data with Emphasis on Glycoproteomics Analysis.

Genomics Proteomics Bioinformatics. 2024 Dec 3;22(5). doi: 10.1093/gpbjnl/qzae069.

Toward integration of glycan chemical databases: an algorithm and software tool for extracting sugars from chemical structures.

Anal Bioanal Chem. 2025 Feb;417(5):945-956. doi: 10.1007/s00216-024-05508-1. Epub 2024 Aug 30.

Functional implications of glycans and their curation: insights from the workshop held at the 16th Annual International Biocuration Conference in Padua, Italy.

Database (Oxford). 2024 Aug 13;2024. doi: 10.1093/database/baae073.

UniCarb-DB: An MS/MS Experimental Glycomic Fragmentation Database.

Methods Mol Biol. 2024;2836:77-96. doi: 10.1007/978-1-0716-4007-4_6.

Carbohydrate Structure Database: current state and recent developments.

Anal Bioanal Chem. 2025 Feb;417(5):1025-1034. doi: 10.1007/s00216-024-05383-w. Epub 2024 Jun 25.

Decoding glycosylation potential from protein structure across human glycoproteins with a multi-view recurrent neural network.

bioRxiv. 2024 May 23:2024.05.15.594334. doi: 10.1101/2024.05.15.594334.

Introduction of a human- and keyboard-friendly N-glycan nomenclature.

Beilstein J Org Chem. 2024 Mar 15;20:607-620. doi: 10.3762/bjoc.20.53. eCollection 2024.

Alterations of Glycan Composition in Aerobic Granular Sludge during the Adaptation to Seawater Conditions.

ACS ES T Water. 2023 Dec 27;4(1):279-286. doi: 10.1021/acsestwater.3c00625. eCollection 2024 Jan 12.

GlycoDraw: a python implementation for generating high-quality glycan figures.

Glycobiology. 2023 Dec 25;33(11):927-934. doi: 10.1093/glycob/cwad063.
