KEGG as a glycome informatics resource.

Author information

Hashimoto Kosuke, Goto Susumu, Kawano Shin, Aoki-Kinoshita Kiyoko F, Ueda Nobuhisa, Hamajima Masami, Kawasaki Toshisuke, Kanehisa Minoru

Affiliation

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan.

Publication information

Glycobiology. 2006 May;16(5):63R-70R. doi: 10.1093/glycob/cwj010. Epub 2005 Jul 13.

DOI:10.1093/glycob/cwj010

PMID:16014746

Abstract

Bioinformatics approaches to carbohydrate research have recently begun using large amounts of protein and carbohydrate data. In this field, called glycome informatics, the foremost necessity is a comprehensive resource for genome-scale bioinformatics analysis of glycan data. Although accumulated experimental data may be useful as a reference for biological and biochemical information on carbohydrates, it is insufficient on its own for bioinformatics analysis. Thus, we have developed a glycome informatics resource (http://www.genome.jp/kegg/glycan/) in KEGG (Kyoto Encyclopedia of Genes and Genomes), an integrated knowledge base of protein networks, genomic information, and chemical information. This review describes three noteworthy features: (1) GLYCAN, a database of carbohydrate structures; (2) glycan-related pathways; and (3) Composite Structure Map (CSM), a map illustrating all possible variations of carbohydrate structures within organisms. GLYCAN includes two useful tools: an intuitive drawing tool called KegDraw, and an efficient glycan search and alignment tool called KEGG Carbohydrate Matcher (KCaM). KEGG's glycan biosynthesis and metabolism pathways, which integrate carbohydrate structures, proteins, and reactions, are also a pivotal resource. CSM is constructed as a bridge between carbohydrate functions and structures; it can display, for example, expression data of glycosyltransferases in a compact manner. In all the KEGG resources, various objects, including KEGG pathways, chemical compounds, and carbohydrate structures, are represented as graphs, a data structure that is widely studied and used in computer science.
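
To illustrate what representing a carbohydrate structure as a graph can mean in practice, the following is a minimal sketch in Python. It encodes the common N-glycan trimannosyl core, Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc, as a rooted tree whose nodes are monosaccharides and whose edges are glycosidic linkages. The node ids, the variable names, and the helper functions are all hypothetical and chosen for illustration; this is not KEGG's own file format or API.

```python
from collections import Counter

# Nodes: hypothetical integer ids mapped to monosaccharide names.
# Node 1 is the reducing-end residue and acts as the root of the tree.
nodes = {1: "GlcNAc", 2: "GlcNAc", 3: "Man", 4: "Man", 5: "Man"}

# Edges: (child, parent, linkage). Each edge is one glycosidic bond.
edges = [
    (2, 1, "b1-4"),
    (3, 2, "b1-4"),
    (4, 3, "a1-3"),
    (5, 3, "a1-6"),
]


def children(node_id):
    """Return the ids of residues attached directly to node_id."""
    return sorted(child for child, parent, _ in edges if parent == node_id)


def leaves():
    """Return the ids of non-reducing-end residues (nodes with no children)."""
    parents = {parent for _, parent, _ in edges}
    return sorted(n for n in nodes if n not in parents)


def composition():
    """Count how many residues of each monosaccharide type the glycan has."""
    return Counter(nodes.values())


if __name__ == "__main__":
    print(children(3))
    print(leaves())
    print(dict(composition()))
```

Because the structure is an ordinary tree, standard graph operations such as traversal, subtree comparison, and branch counting apply directly, which is the kind of benefit the abstract attributes to a graph representation.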
