BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology.

Author information

Computer Science Department, University of Crete, Voutes Campus, 70013 Heraklion, Crete, Greece.

Gnosis Data Analysis PC, Palaiokapa 64, 71305 Heraklion, Crete, Greece.

Publication information

Database (Oxford). 2018 Jan 1;2018. doi: 10.1093/database/bay011.

Abstract

The biotechnology revolution is generating omics data at an exponentially growing pace. Biological data mining therefore demands automatic, 'high-quality' curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-annotated omics data that aims to promote and accelerate the reuse of public data. We followed the same preprocessing pipeline for each biological mart (microarray gene expression, RNA-Seq gene expression and DNA methylation) to produce datasets ready for downstream analysis, and automatically annotated them with disease-ontology terms. We also flag datasets that share common samples and automatically discover control samples in case-control studies. Currently, BioDataome includes ∼5600 datasets and ∼260 000 samples spanning ∼500 diseases, and can easily be used in large-scale experiments and meta-analyses. All datasets are publicly available for querying and downloading via the BioDataome web application. We demonstrate BioDataome's utility by presenting exploratory data analysis examples. We have also developed the BioDataome R package, available at https://github.com/mensxmachina/BioDataome/.

Database URL: http://dataome.mensxmachina.org/.
