Aoki Kiyoko F, Mamitsuka Hiroshi, Akutsu Tatsuya, Kanehisa Minoru
Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan.
Bioinformatics. 2005 Apr 15;21(8):1457-63. doi: 10.1093/bioinformatics/bti193. Epub 2004 Dec 7.
Glycans are the third major class of biomolecules, following DNA and proteins, and they are vital for the functioning of multicellular organisms. However, compared with the rapid development of sequence analysis techniques, informatics work on glycans has a long way to go. Alignment algorithms for glycan tree structures are one of the foremost concerns. In addition, the statistical analysis of these algorithms in terms of biological significance needs to be addressed.
We developed a tree-structure alignment algorithm for glycans and performed a statistical analysis of the resulting alignment scores so that biologically interesting features could be captured in a score matrix for glycans. We generated the score matrix in a manner similar to BLOSUM, but with slight variations to accommodate our glycan data, including the incorporation of linkage information. We verified the effectiveness of the new glycan score matrix by showing how well its entries correspond with biological knowledge. We also discuss future work: because glycans are complex, further improvement may come from using different score matrices for different glycan subclasses.
The glycan score matrix can be downloaded from http://kanehisa.kuicr.kyoto-u.ac.jp/Paper/kcam/glycanMatrix0.1.txt.
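As an illustration of what "in a manner similar to BLOSUM" can mean in practice, the following minimal Python sketch computes log-odds scores from counts of aligned label pairs. It is not the procedure used in the paper: the function name `log_odds_matrix`, the `scale` parameter, and the labels that combine a monosaccharide with its linkage (for example `"Gal b1-4"`) are assumptions made for the example, and the sequence-clustering step of BLOSUM is omitted.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement


def log_odds_matrix(aligned_pairs, scale=2.0):
    """Return BLOSUM-style log-odds scores for unordered label pairs.

    aligned_pairs: iterable of (a, b) labels observed at aligned positions.
    Each label is assumed (hypothetically) to encode a monosaccharide
    together with its linkage, e.g. "Gal b1-4".
    """
    # Count unordered pairs, so (a, b) and (b, a) share one entry.
    counts = Counter()
    for a, b in aligned_pairs:
        counts[tuple(sorted((a, b)))] += 1

    total = sum(counts.values())
    if total == 0:
        raise ValueError("no aligned pairs supplied")

    # Observed pair frequencies q_ab.
    q = {pair: n / total for pair, n in counts.items()}

    # Marginal frequency of each label: p_a = q_aa + sum over b != a of q_ab / 2.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2

    scores = {}
    for a, b in combinations_with_replacement(sorted(p), 2):
        observed = q.get((a, b), 0.0)
        if observed == 0.0:
            continue  # pairs never observed are left out of the matrix
        # Frequency expected if labels paired independently.
        expected = p[a] * p[a] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(scale * math.log2(observed / expected))
    return scores
```

Under this scheme a positive score means a pair is aligned more often than chance would predict and a negative score means less often. Because each label carries its linkage, the same monosaccharide with two different linkages is scored as two distinct symbols.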