Eavenson Matthew, Kochut Krys J, Miller John A, Ranzinger René, Tiemeyer Michael, Aoki Kazuhiro, York William S
Department of Computer Science.
Glycobiology. 2015 Jan;25(1):66-73. doi: 10.1093/glycob/cwu090. Epub 2014 Aug 27.
Most currently available glycan structure databases use their own proprietary structure representation schemas and contain numerous annotation errors. These issues cause problems when the databases are used to annotate or mine data generated in the laboratory. Because glycan structures are complex, curating these databases is often tedious and labor-intensive. Rigorous validation of glycan structures becomes easier, however, with a curation workflow that incorporates a structure-matching algorithm. The algorithm compares candidate glycans to a canonical tree, which embodies structural features consistent with established mechanisms for the biosynthesis of a particular class of glycans. To this end, we have implemented Qrator, a web-based application that uses a combination of external literature and database references, user annotations and canonical trees to assist and guide researchers in making informed decisions while curating glycans. Using this application, we have started curating large numbers of N-glycans, O-glycans and glycosphingolipids. Our curation workflow supports creating and extending canonical trees for these classes of glycans, and the resulting trees have subsequently been used to improve the curation workflow.
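The idea of comparing a candidate glycan to a canonical tree can be illustrated with a minimal sketch. This is not Qrator's actual algorithm or data model, which the abstract does not specify. It assumes that each residue is a node labeled with a monosaccharide name and its linkage to the parent, that sibling (name, linkage) pairs are unique within the canonical tree, and that a candidate is accepted when it is a rooted subtree of the canonical tree. The names `Residue`, `matches` and `CANONICAL` are hypothetical, and the small tree used here is only the N-glycan core with one GlcNAc extension per arm.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Residue:
    """One monosaccharide node; `link` is the linkage to its parent (empty at the root)."""
    name: str
    link: str = ""
    children: list[Residue] = field(default_factory=list)


def matches(candidate: Residue, canonical: Residue) -> bool:
    """Return True if `candidate` is a rooted subtree of `canonical`.

    Assumes sibling (name, link) pairs are unique in the canonical tree,
    so a greedy child-by-child assignment is sufficient.
    """
    if (candidate.name, candidate.link) != (canonical.name, canonical.link):
        return False
    used: set[int] = set()  # canonical children already claimed by a candidate child
    for child in candidate.children:
        for i, canon_child in enumerate(canonical.children):
            if i not in used and matches(child, canon_child):
                used.add(i)
                break
        else:
            return False  # no canonical branch accounts for this residue
    return True


# Hypothetical canonical tree: N-glycan core plus one GlcNAc on each arm.
CANONICAL = Residue("GlcNAc", "", [
    Residue("GlcNAc", "b1-4", [
        Residue("Man", "b1-4", [
            Residue("Man", "a1-3", [Residue("GlcNAc", "b1-2")]),
            Residue("Man", "a1-6", [Residue("GlcNAc", "b1-2")]),
        ]),
    ]),
])


def core(first_arm_link: str) -> Residue:
    """Build a trimannosyl core whose first arm uses the given linkage."""
    return Residue("GlcNAc", "", [
        Residue("GlcNAc", "b1-4", [
            Residue("Man", "b1-4", [
                Residue("Man", first_arm_link),
                Residue("Man", "a1-6"),
            ]),
        ]),
    ])


plausible = core("a1-3")   # consistent with the canonical tree
suspicious = core("a1-4")  # linkage absent from the canonical tree
```

In this sketch, `matches(plausible, CANONICAL)` accepts the first structure, while `matches(suspicious, CANONICAL)` rejects the second. In a curation setting, a rejected structure would be flagged for review by a curator rather than discarded.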