Normalization of RNA-seq data using factor analysis of control genes or samples.

Authors

Risso Davide, Ngai John, Speed Terence P, Dudoit Sandrine

Affiliations

Department of Statistics, University of California, Berkeley, Berkeley, California, USA.

[1] Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California, USA. [2] Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, USA. [3] Functional Genomics Laboratory, University of California, Berkeley, Berkeley, California, USA.

Publication information

Nat Biotechnol. 2014 Sep;32(9):896-902. doi: 10.1038/nbt.2931. Epub 2014 Aug 24.

Abstract

Normalization of RNA-sequencing (RNA-seq) data has proven essential to ensure accurate inference of expression levels. Here, we show that usual normalization approaches mostly account for sequencing depth and fail to correct for library preparation and other more complex unwanted technical effects. We evaluate the performance of the External RNA Control Consortium (ERCC) spike-in controls and investigate the possibility of using them directly for normalization. We show that the spike-ins are not reliable enough to be used in standard global-scaling or regression-based normalization procedures. We propose a normalization strategy, called remove unwanted variation (RUV), that adjusts for nuisance technical effects by performing factor analysis on suitable sets of control genes (e.g., ERCC spike-ins) or samples (e.g., replicate libraries). Our approach leads to more accurate estimates of expression fold-changes and tests of differential expression compared to state-of-the-art normalization methods. In particular, RUV promises to be valuable for large collaborative projects involving multiple laboratories, technicians, and/or sequencing platforms.
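
The factor-analysis idea in the abstract can be illustrated with a minimal sketch. It assumes log-scale expression values arranged as samples by genes and a known set of control genes that are not affected by the biology of interest. The function name `ruv_g_sketch`, the toy data, and the choice of plain least squares are illustrative assumptions, not the authors' implementation (the published method and its software may differ in important details, such as how count data are modelled).

```python
import numpy as np

def ruv_g_sketch(log_counts, control_idx, k=1):
    """Hypothetical helper: estimate k unwanted factors from control genes
    and regress them out of every gene.

    log_counts  : array of shape (n_samples, n_genes), log-scale expression
    control_idx : indices of control genes (e.g. spike-ins)
    Returns (adjusted, W), where W has shape (n_samples, k).
    """
    Y = np.asarray(log_counts, dtype=float)
    # Centre each gene so the factors capture variation between samples.
    centered = Y - Y.mean(axis=0, keepdims=True)
    # Factor analysis step: SVD restricted to the control genes.
    U, s, Vt = np.linalg.svd(centered[:, control_idx], full_matrices=False)
    W = U[:, :k]  # estimated unwanted factors, one column per factor
    # Least-squares loadings of every gene on the unwanted factors.
    alpha, *_ = np.linalg.lstsq(W, Y, rcond=None)
    # Remove the estimated unwanted variation.
    return Y - W @ alpha, W

# Toy data (assumed, for illustration only): one batch-like nuisance factor,
# no biological effect, first 50 genes treated as controls.
rng = np.random.default_rng(0)
n_samples, n_genes = 6, 200
control_idx = np.arange(50)
batch = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
gene_means = rng.normal(5.0, 1.0, size=n_genes)
loadings = rng.normal(1.0, 0.2, size=n_genes)
noise = rng.normal(0.0, 0.05, size=(n_samples, n_genes))
Y = gene_means + np.outer(batch, loadings) + noise

adjusted, W = ruv_g_sketch(Y, control_idx, k=1)
print("mean per-gene variance before:", Y.var(axis=0).mean())
print("mean per-gene variance after: ", adjusted.var(axis=0).mean())
```

In this toy setting the between-sample variance drops sharply after adjustment because all of it was nuisance variation. With real data the number of factors `k` and the quality of the control genes matter a great deal: the abstract itself notes that spike-ins were not reliable enough for standard global-scaling or regression-based normalization.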
