
Optimization of an RNA-Seq Differential Gene Expression Analysis Depending on Biological Replicate Number and Library Size.

Authors

Lamarre Sophie, Frasse Pierre, Zouine Mohamed, Labourdette Delphine, Sainderichin Elise, Hu Guojian, Le Berre-Anton Véronique, Bouzayen Mondher, Maza Elie

Affiliations

LISBP, Centre National de la Recherche Scientifique, INRA, INSA, Université de Toulouse, Toulouse, France.

GBF, Université de Toulouse, INRA, Castanet-Tolosan, France.

Publication details

Front Plant Sci. 2018 Feb 14;9:108. doi: 10.3389/fpls.2018.00108. eCollection 2018.

DOI:10.3389/fpls.2018.00108
PMID:29491871
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5817962/
Abstract

RNA-Seq is a widely used technology that allows efficient genome-wide quantification of gene expression, for example for differential expression (DE) analysis. After a brief review of the main issues, methods and tools related to the DE analysis of RNA-Seq data, this article focuses on the impact of both the replicate number and library size in such analyses. While the main drawback of previous relevant studies is the lack of generality, we conducted both an analysis of a two-condition experiment (with eight biological replicates per condition) to compare the results with previous benchmark studies, and a meta-analysis of 17 experiments with up to 18 biological conditions, eight biological replicates and 100 million (M) reads per sample. As a global trend, we concluded that the replicate number has a larger impact than the library size on the power of the DE analysis, except for low-expressed genes, for which both parameters seem to have the same impact. Our study also provides new insights for practitioners aiming to enhance their experimental designs. For instance, by analyzing both the sensitivity and specificity of the DE analysis, we showed that the optimal threshold to control the false discovery rate (FDR) is approximately $2^{-r}$, where r is the replicate number. Furthermore, we showed that the false positive rate (FPR) is rather well controlled by all three studied R packages: DESeq, DESeq2, and edgeR. We also analyzed the impact of both the replicate number and library size on gene ontology (GO) enrichment analysis. Interestingly, we concluded that increases in the replicate number and library size tend to enhance the sensitivity and specificity, respectively, of the GO analysis. Finally, we recommend that RNA-Seq practitioners produce a pilot data set to strictly analyze the power of their experimental design, or use a public data set that is similar to the data set they will obtain. For individuals working on tomato research, on the basis of the meta-analysis, we recommend at least four biological replicates per condition and 20 M reads per sample to be almost sure of obtaining about 1000 DE genes if they exist.
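
To make the replicate-dependent FDR threshold concrete, here is a minimal Python sketch. It reads the abstract's threshold as 2^(-r), and it assumes the cutoff is applied to multiple-testing-adjusted p-values; that reading, the function names `fdr_threshold` and `call_de_genes`, and the toy gene values are illustrative assumptions, not taken from the paper.

```python
def fdr_threshold(replicates: int) -> float:
    """Return 2**(-r), the replicate-dependent FDR cutoff (assumed reading)."""
    if replicates < 1:
        raise ValueError("replicates must be a positive integer")
    return 2.0 ** (-replicates)


def call_de_genes(adjusted_pvalues: dict[str, float], replicates: int) -> list[str]:
    """Return the genes whose adjusted p-value is below the cutoff, sorted by name."""
    cutoff = fdr_threshold(replicates)
    return sorted(gene for gene, p in adjusted_pvalues.items() if p < cutoff)


if __name__ == "__main__":
    # The cutoff halves with every additional biological replicate.
    for r in (2, 3, 4, 6, 8):
        print(f"r = {r}: FDR threshold = {fdr_threshold(r):.4f}")

    # Hypothetical adjusted p-values, for illustration only.
    toy = {"geneA": 0.001, "geneB": 0.04, "geneC": 0.20}
    print(call_de_genes(toy, replicates=4))
    print(call_de_genes(toy, replicates=8))
```

Under this reading, four replicates give a cutoff of 0.0625 and eight replicates give roughly 0.0039, so the criterion becomes stricter as replicates are added.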


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b06d/5817962/2e3b19c7cea7/fpls-09-00108-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b06d/5817962/211f7a77ab81/fpls-09-00108-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b06d/5817962/df86c3974588/fpls-09-00108-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b06d/5817962/0114aec3b9df/fpls-09-00108-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b06d/5817962/8db3678636cc/fpls-09-00108-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b06d/5817962/8889887b7895/fpls-09-00108-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b06d/5817962/71953ae97c4c/fpls-09-00108-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b06d/5817962/1d6d191d8a45/fpls-09-00108-g0008.jpg

