Frey Lewis J
Biomedical Informatics Center, Medical University of South Carolina, 135 Cannon Street, MSC 200, Charleston, SC, 29425, USA
Methods Mol Biol. 2015;1229:271-87. doi: 10.1007/978-1-4939-1714-3_23.
Glycomics researchers have identified the need for integrated database systems that collect glycomics information in a consistent format. The goal is to create a resource for knowledge discovery and dissemination to wider research communities, potentially extending the field to include biologists, clinicians, chemists, and computer scientists. This chapter discusses the technology and approach needed to create integrated data resources that empower the broader community to leverage extant glycomics data. The focus is on glycosaminoglycan (GAG) and proteoglycan research, but the approach can be generalized. The methods described span the development of glycomics standards from CarbBank to Glyco Connection Tables. Integrated data sets provide a foundation for novel methods of analysis, such as machine learning for knowledge discovery. The implications of predictive analysis are examined in relation to disease biomarkers, with the aim of expanding the target audience of GAG and proteoglycan research.