Lebrón Ricardo, Gómez-Martín Cristina, Carpena Pedro, Bernaola-Galván Pedro, Barturen Guillermo, Hackenberg Michael, Oliver José L
Department of Genetics, Faculty of Science, University of Granada, Campus de Fuentenueva s/n, 18071-Granada, Spain.
Laboratory of Bioinformatics, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100-Granada, Spain.
Nucleic Acids Res. 2017 Jan 4;45(D1):D97-D103. doi: 10.1093/nar/gkw996. Epub 2016 Oct 27.
The 2017 update of NGSmethDB stores whole-genome methylomes generated from short-read data sets obtained by whole-genome bisulfite sequencing (WGBS). To generate high-quality methylomes, stringent quality controls were integrated with third-party software, and a two-step mapping process was added to exploit the advantages of the new genome assembly models. All samples were profiled under constant parameter settings, thus enabling comparative downstream analyses. Besides a significant increase in the number of samples, NGSmethDB now includes two additional data types, which are a valuable resource for the discovery of methylation epigenetic biomarkers: (i) differentially methylated single cytosines; and (ii) methylation segments (i.e. genome regions of homogeneous methylation). The NGSmethDB back-end is now based on MongoDB, a NoSQL hierarchical database using JSON-formatted documents and dynamic schemas, thus accelerating comparative analyses across samples. Besides conventional database dumps, track hubs were implemented, which improve database access, visualization in genome browsers and comparison with third-party annotations. In addition, the database can also be accessed through a RESTful API. Lastly, a Python client and a multiplatform virtual machine allow for program-driven access from the user's desktop. This way, private methylation data can be compared to NGSmethDB without the need to upload them to public servers. Database website: http://bioinfo2.ugr.es/NGSmethDB.
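
As a rough illustration of the local comparison described above, the following Python sketch compares per-cytosine methylation levels between a private sample and records retrieved from a database, entirely on the user's machine. It is a minimal sketch under stated assumptions, not the NGSmethDB client or its method: the record fields (`chrom`, `pos`, `meth`, `cov`), the function names and the two thresholds are hypothetical, and the plain difference cutoff stands in for whatever statistical test the database uses to call differentially methylated cytosines.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical JSON-like record, one per cytosine:
#   {"chrom": "chr1", "pos": 100, "meth": 18, "cov": 20}
# "meth" = reads supporting methylation, "cov" = total reads covering the site.
Record = Dict[str, object]
Site = Tuple[str, int]


def methylation_ratio(meth: int, cov: int) -> float:
    """Fraction of reads supporting methylation at one cytosine."""
    if cov <= 0:
        raise ValueError("coverage must be positive")
    return meth / cov


def index_by_site(records: Iterable[Record]) -> Dict[Site, Record]:
    """Index records by (chromosome, position) for fast lookup."""
    return {(str(r["chrom"]), int(r["pos"])): r for r in records}


def compare_samples(
    public: Iterable[Record],
    private: Iterable[Record],
    min_coverage: int = 10,   # assumed cutoff, chosen for illustration
    min_delta: float = 0.5,   # assumed cutoff, chosen for illustration
) -> List[Tuple[str, int, float]]:
    """Return sites whose methylation ratio differs by at least min_delta.

    Only sites present in both samples with sufficient coverage are
    considered. The delta is private minus public.
    """
    public_index = index_by_site(public)
    hits = []
    for site, priv in sorted(index_by_site(private).items()):
        pub = public_index.get(site)
        if pub is None:
            continue
        if int(pub["cov"]) < min_coverage or int(priv["cov"]) < min_coverage:
            continue
        delta = methylation_ratio(int(priv["meth"]), int(priv["cov"])) - \
            methylation_ratio(int(pub["meth"]), int(pub["cov"]))
        if abs(delta) >= min_delta:
            hits.append((site[0], site[1], round(delta, 4)))
    return hits
```

Because the private records never leave the local process, this mirrors the workflow in which only public data are downloaded and the comparison runs on the desktop.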