Massive mining of publicly available RNA-seq data from human and mouse.

Affiliations

Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA.

Publication information

Nat Commun. 2018 Apr 10;9(1):1366. doi: 10.1038/s41467-018-03751-6.

DOI:10.1038/s41467-018-03751-6
PMID:29636450
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5893633/
Abstract

RNA sequencing (RNA-seq) is the leading technology for genome-wide transcript quantification. However, publicly available RNA-seq data is currently provided mostly in raw form, a significant barrier for global and integrative retrospective analyses. ARCHS4 is a web resource that makes the majority of published RNA-seq data from human and mouse available at the gene and transcript levels. For developing ARCHS4, available FASTQ files from RNA-seq experiments from the Gene Expression Omnibus (GEO) were aligned using a cloud-based infrastructure. In total 187,946 samples are accessible through ARCHS4 with 103,083 mouse and 84,863 human. Additionally, the ARCHS4 web interface provides intuitive exploration of the processed data through querying tools, interactive visualization, and gene pages that provide average expression across cell lines and tissues, top co-expressed genes for each gene, and predicted biological functions and protein-protein interactions for each gene based on prior knowledge combined with co-expression.
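
The "top co-expressed genes for each gene" mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so the Pearson correlation here is an assumption, and the function names (`pearson`, `top_coexpressed`), gene labels, and expression values are hypothetical toy examples, not ARCHS4 data or ARCHS4 code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant vector: correlation is undefined, treat as 0
    return cov / sqrt(var_x * var_y)


def top_coexpressed(expression, gene, k=2):
    """Return the k genes most correlated with `gene`, highest first."""
    target = expression[gene]
    scores = [
        (other, pearson(target, values))
        for other, values in expression.items()
        if other != gene
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy gene-by-sample matrix (made-up numbers, assumed for illustration only).
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],  # rises in step with GENE_A
    "GENE_C": [4.0, 3.0, 2.0, 1.0],  # falls as GENE_A rises
    "GENE_D": [1.0, 3.0, 2.0, 4.0],  # loosely follows GENE_A
}

top = top_coexpressed(toy_expression, "GENE_A", k=2)
print(top)
```

On a real gene-by-sample matrix with many thousands of samples, the same ranking would normally be computed with a vectorized correlation matrix rather than pairwise loops; the loop form is kept here only for readability.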

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b536/5893633/f68f4c34b035/41467_2018_3751_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b536/5893633/aed3f0d50baa/41467_2018_3751_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b536/5893633/7f2d43a129f3/41467_2018_3751_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b536/5893633/cd4912d6ccb8/41467_2018_3751_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b536/5893633/c90b84d29b8f/41467_2018_3751_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b536/5893633/c463d4e4fac0/41467_2018_3751_Fig6_HTML.jpg