National Genomics Data Center, Bio-Med Big Data Center, Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Xuhui, Shanghai 200031, China.
College of Computer, Hubei University of Education, 129 Second Gaoxin Road, Wuhan Hi-Tech Zone, Wuhan 430205, China.
Database (Oxford). 2020 Jan 1;2020. doi: 10.1093/database/baaa047.
Glycosyltransferases (GTs), a large class of carbohydrate-active enzymes, add glycosyl moieties to various substrates to generate multiple bioactive compounds, including natural products of pharmaceutical or agrochemical value. Here, we first collected comprehensive information on GTs, including amino acid sequences, coding region sequences, available tertiary structures, protein classification families, catalytic reactions and metabolic pathways. Then, we developed sequence search and molecular docking processes for GTs, resulting in a GTs database (GTDB). In the present study, 520 179 GTs from approximately 21 647 species, involved in 394 different reactions, were deposited in GTDB. GTDB has the following useful features: (i) a text search retrieves the complete details of a query by combining multiple identifiers and data sources; (ii) a convenient browser allows users to browse data by different classifications and download data in batches; (iii) BLAST is offered for searching against pre-defined sequences, which can facilitate the annotation of the biological functions of query GTs; and lastly, (iv) GTdock, which uses AutoDock Vina, performs docking simulations of several GTs with the same single acceptor and displays the results with 3Dmol.js for easy viewing of the models.
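As an illustration of the idea behind feature (iv), docking several GTs against one shared acceptor, the following minimal Python sketch builds one AutoDock Vina command per receptor. It is not GTDB's actual pipeline: the function name `build_vina_commands`, the file names, the output directory, and the use of a single search box for every receptor are all assumptions made for the example. In practice each GT would likely need its own box centred on its binding site. Only standard `vina` command-line options are used, and the commands are built rather than executed.

```python
from pathlib import Path
import shlex


def build_vina_commands(receptors, acceptor, center, size,
                        out_dir="docked", exhaustiveness=8):
    """Return one AutoDock Vina argument list per GT receptor.

    Hypothetical helper: every receptor is docked with the same
    acceptor (ligand) file and, as a simplifying assumption, the
    same search box (center and size in Angstroms).
    """
    cx, cy, cz = center
    sx, sy, sz = size
    commands = []
    for receptor in receptors:
        # Name each result after the receptor/acceptor pair.
        out_name = f"{Path(receptor).stem}_{Path(acceptor).stem}.pdbqt"
        out_file = (Path(out_dir) / out_name).as_posix()
        commands.append([
            "vina",
            "--receptor", str(receptor),
            "--ligand", str(acceptor),
            "--center_x", str(cx),
            "--center_y", str(cy),
            "--center_z", str(cz),
            "--size_x", str(sx),
            "--size_y", str(sy),
            "--size_z", str(sz),
            "--exhaustiveness", str(exhaustiveness),
            "--out", out_file,
        ])
    return commands


if __name__ == "__main__":
    # Placeholder file names; real inputs must be prepared PDBQT files.
    cmds = build_vina_commands(
        ["gt_a.pdbqt", "gt_b.pdbqt"],
        "acceptor.pdbqt",
        center=(0.0, 0.0, 0.0),
        size=(20.0, 20.0, 20.0),
    )
    for cmd in cmds:
        print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run` once the receptor and acceptor files exist. The resulting poses could then be compared across GTs by their reported binding affinities.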