Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation.

Affiliation

Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia.

Publication information

Nucleic Acids Res. 2012 May;40(10):4288-97. doi: 10.1093/nar/gks042. Epub 2012 Jan 28.

DOI:10.1093/nar/gks042

PMID:22287627

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3378882/

Abstract

A flexible statistical framework is developed for the analysis of read counts from RNA-Seq gene expression studies. It provides the ability to analyse complex experiments involving multiple treatment conditions and blocking variables while still taking full account of biological variation. Biological variation between RNA samples is estimated separately from the technical variation associated with sequencing technologies. Novel empirical Bayes methods allow each gene to have its own specific variability, even when there are relatively few biological replicates from which to estimate such variability. The pipeline is implemented in the edgeR package of the Bioconductor project. A case study analysis of carcinoma data demonstrates the ability of generalized linear model methods (GLMs) to detect differential expression in a paired design, and even to detect tumour-specific expression changes. The case study demonstrates the need to allow for gene-specific variability, rather than assuming a common dispersion across genes or a fixed relationship between abundance and variability. Genewise dispersions de-prioritize genes with inconsistent results and allow the main analysis to focus on changes that are consistent between biological replicates. Parallel computational approaches are developed to make non-linear model fitting faster and more reliable, making the application of GLMs to genomic data more convenient and practical. Simulations demonstrate the ability of adjusted profile likelihood estimators to return accurate estimators of biological variability in complex situations. When variation is gene-specific, empirical Bayes estimators provide an advantageous compromise between the extremes of assuming common dispersion or separate genewise dispersion. The methods developed here can also be applied to count data arising from DNA-Seq applications, including ChIP-Seq for epigenetic marks and DNA methylation analyses.
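
The separation of technical from biological variation described in the abstract can be illustrated with the negative binomial mean-variance relationship, where the variance of a count with mean `mu` and dispersion `phi` is `mu + phi * mu**2` and the square root of `phi` is the biological coefficient of variation. The sketch below is a minimal illustration under that assumption and is not the edgeR implementation. In particular, `shrink_dispersion` is a hypothetical linear shrinkage rule, with made-up parameter names `prior_weight` and `n_df`, intended only to show what a compromise between common and genewise dispersion looks like; edgeR itself works with weighted likelihoods rather than a simple weighted average.

```python
import math


def nb_variance(mu, dispersion):
    """Negative binomial variance: Poisson part (mu) plus biological part."""
    return mu + dispersion * mu ** 2


def squared_cv(mu, dispersion):
    """Total squared CV = technical squared CV (1/mu) + biological squared CV."""
    return 1.0 / mu + dispersion


def biological_cv(dispersion):
    """Biological coefficient of variation is the square root of the dispersion."""
    return math.sqrt(dispersion)


def shrink_dispersion(genewise, common, prior_weight, n_df):
    """Hypothetical shrinkage (an assumption, not edgeR's estimator):
    a weighted average of a gene's own dispersion and the common dispersion.
    n_df is the gene's residual degrees of freedom, prior_weight the weight
    given to the common value."""
    return (n_df * genewise + prior_weight * common) / (n_df + prior_weight)


if __name__ == "__main__":
    mu, phi = 100.0, 0.16
    print("variance:", nb_variance(mu, phi))
    print("total CV^2:", squared_cv(mu, phi))
    print("BCV:", biological_cv(phi))
    # With few replicates (small n_df) the estimate is pulled toward the common value.
    print("shrunk:", shrink_dispersion(0.5, 0.1, prior_weight=4, n_df=4))
```

As the mean count grows, the technical term `1/mu` vanishes and the squared CV approaches the dispersion, which is why biological variation dominates for highly expressed genes.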

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3e9/3378882/fdfaa61455e1/gks042f1.jpg
