Gene Expression Nebulas (GEN): a comprehensive data portal integrating transcriptomic profiles across multiple species at both bulk and single-cell levels.

Affiliations

National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.

China National Center for Bioinformation, Beijing 100101, China.

Publication information

Nucleic Acids Res. 2022 Jan 7;50(D1):D1016-D1024. doi: 10.1093/nar/gkab878.

DOI: 10.1093/nar/gkab878

PMID: 34591957

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8728231/

Abstract

Transcriptomic profiling is critical to uncovering functional elements from transcriptional and post-transcriptional aspects. Here, we present Gene Expression Nebulas (GEN, https://ngdc.cncb.ac.cn/gen/), an open-access data portal integrating transcriptomic profiles under various biological contexts. GEN features a curated collection of high-quality bulk and single-cell RNA sequencing datasets by using standardized data processing pipelines and a structured curation model. Currently, GEN houses a large number of gene expression profiles from 323 datasets (157 bulk and 166 single-cell), covering 50 500 samples and 15 540 169 cells across 30 species, which are further categorized into six biological contexts. Moreover, GEN integrates a full range of transcriptomic profiles on expression, RNA editing and alternative splicing for 10 bulk datasets, providing opportunities for users to conduct integrative analysis at both transcriptional and post-transcriptional levels. In addition, GEN provides abundant gene annotations based on value-added curation of transcriptomic profiles and delivers online services for data analysis and visualization. Collectively, GEN presents a comprehensive collection of transcriptomic profiles across multiple species, thus serving as a fundamental resource for better understanding genetic regulatory architecture and functional mechanisms from tissues to cells.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6066/8728231/f3aead3616fb/gkab878fig1.jpg
