A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis.

Affiliation

Institut Pasteur, PF2 Plate-forme Transcriptome et Epigénome, 28 rue du Dr Roux, Paris CEDEX 15, F-75724 France. Tel.: +33 (0) 145688651; Fax: +33 (0) 145688406.

Publication information

Brief Bioinform. 2013 Nov;14(6):671-83. doi: 10.1093/bib/bbs046. Epub 2012 Sep 17.

DOI:10.1093/bib/bbs046

PMID:22988256

Abstract

During the last 3 years, a number of approaches for the normalization of RNA sequencing data have emerged in the literature, differing both in the type of bias adjustment and in the statistical strategy adopted. However, as data continue to accumulate, there has been no clear consensus on the appropriate normalization method to be used or the impact of a chosen method on the downstream analysis. In this work, we focus on a comprehensive comparison of seven recently proposed normalization methods for the differential analysis of RNA-seq data, with an emphasis on the use of varied real and simulated datasets involving different species and experimental designs to represent data characteristics commonly observed in practice. Based on this comparison study, we propose practical recommendations on the appropriate normalization method to be used and its impact on the differential analysis of RNA-seq data.