von der Lieth Claus-Wilhelm, Bohne-Lang Andreas, Lohmann Klaus Karl, Frank Martin
German Cancer Research Center, Central Spectroscopic Department B090, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany.
Brief Bioinform. 2004 Jun;5(2):164-78. doi: 10.1093/bib/5.2.164.
The term 'glycomics' describes the scientific attempt to identify and study all the glycan molecules, collectively the glycome, synthesised by an organism. The aim is to create a cell-by-cell catalogue of glycosyltransferase expression and detected glycan structures. The current status of databases and bioinformatics tools, which are still in their infancy, is reviewed. The structures of glycans, as secondary gene products, cannot be easily predicted from the DNA sequence. Glycan sequences cannot be described by a simple linear one-letter code, because each pair of monosaccharides can be linked in several ways and branched structures can be formed. Few of the bioinformatics algorithms developed for genomics/proteomics can be directly adapted for glycomics. The development of algorithms that allow rapid, automatic interpretation of mass spectra to identify glycan structures is currently the most active field of research. The lack of generally accepted ways to normalise glycan structures, and of accepted glycan exchange formats, hampers efficient cross-linking and the automatic exchange of distributed data. The emerging field of glycomics should accept that unrestricted dissemination of scientific data accelerates scientific findings and prompts new initiatives to explore the data.
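As an illustration of why a linear one-letter code is insufficient, the following minimal Python sketch stores a glycan as a tree in which every residue records the linkage to its parent. It is not taken from the review: the class `Residue`, the helper functions, and the bracketed text form (loosely modelled on condensed carbohydrate notation) are assumptions introduced here for illustration only. The example structure is the common N-glycan core pentasaccharide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Residue:
    """One monosaccharide in a glycan tree (hypothetical illustrative type)."""
    name: str                       # monosaccharide name, e.g. "Man"
    linkage: Optional[str] = None   # bond to the parent, e.g. "a1-3"; None at the root
    children: List["Residue"] = field(default_factory=list)


def count_residues(node: Residue) -> int:
    """Total number of monosaccharides in the tree rooted at node."""
    return 1 + sum(count_residues(child) for child in node.children)


def branch_points(node: Residue) -> int:
    """Number of residues that carry more than one substituent."""
    own = 1 if len(node.children) > 1 else 0
    return own + sum(branch_points(child) for child in node.children)


def to_text(node: Residue) -> str:
    """Write the tree as text: the last child continues the main chain,
    every other child is a side branch enclosed in square brackets."""
    prefix = ""
    if node.children:
        *side, main = node.children
        prefix = to_text(main) + "".join(f"[{to_text(s)}]" for s in side)
    link = f"({node.linkage})" if node.linkage else ""
    return prefix + node.name + link


# N-glycan core: two mannoses attached to a central mannose, on a chitobiose stem.
core = Residue("GlcNAc", children=[
    Residue("GlcNAc", "b1-4", children=[
        Residue("Man", "b1-4", children=[
            Residue("Man", "a1-6"),   # side branch
            Residue("Man", "a1-3"),   # treated as main chain in this sketch
        ]),
    ]),
])

print(to_text(core))         # Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc
print(count_residues(core))  # 5
print(branch_points(core))   # 1
```

Even for this five-residue structure, the text form needs linkage annotations and brackets that a one-letter-per-residue string cannot carry. The choice of which child counts as the main chain is arbitrary in this sketch, which hints at the normalisation problem the abstract mentions.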