Differential expression in RNA-seq: a matter of depth.

Affiliation

Bioinformatics and Genomics Department, Centro de Investigación Príncipe Felipe, 46012 Valencia, Spain.

Publication information

Genome Res. 2011 Dec;21(12):2213-23. doi: 10.1101/gr.124321.111. Epub 2011 Sep 8.

Abstract

Next-generation sequencing (NGS) technologies are revolutionizing genome research, and in particular, their application to transcriptomics (RNA-seq) is increasingly being used for gene expression profiling as a replacement for microarrays. However, the properties of RNA-seq data have not yet been fully established, and additional research is needed to understand how these data respond to differential expression analysis. In this work, we set out to gain insights into the characteristics of RNA-seq data analysis by studying an important parameter of this technology: the sequencing depth. We have analyzed how sequencing depth affects the detection of transcripts and their identification as differentially expressed, looking at aspects such as transcript biotype, length, expression level, and fold-change. We have evaluated different algorithms available for the analysis of RNA-seq and proposed a novel approach, NOISeq, that differs from existing methods in that it is data-adaptive and nonparametric. Our results reveal that most existing methodologies suffer from a strong dependency on sequencing depth for their differential expression calls and that this results in a considerable number of false positives that increases as the number of reads grows. In contrast, our proposed method models the noise distribution from the actual data, can therefore better adapt to the size of the data set, and is more effective in controlling the rate of false discoveries. This work discusses the true potential of RNA-seq for studying regulation at low expression ranges, the noise within RNA-seq data, and the issue of replication.
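
The idea of modeling the noise distribution from the actual data can be illustrated with a minimal Python sketch. It is not the NOISeq implementation: the abstract gives no algorithmic details, so the statistics used here (a log2 ratio M and an absolute difference D per gene), the pseudocount, the function names, and the synthetic counts are all assumptions made for illustration. Replicates of the same condition supply an empirical noise distribution, and a gene is scored by the fraction of noise pairs that its between-condition change exceeds in both M and D.

```python
import math
import random


def md_stats(a, b, pseudo=0.5):
    """Return (M, D): log2 ratio and absolute difference of two counts.

    The pseudocount is an assumption of this sketch; it avoids log(0).
    """
    a, b = a + pseudo, b + pseudo
    return math.log2(a / b), abs(a - b)


def noise_distribution(rep1, rep2):
    """Empirical (|M|, D) pairs from two replicates of the SAME condition."""
    pairs = []
    for x, y in zip(rep1, rep2):
        m, d = md_stats(x, y)
        pairs.append((abs(m), d))
    return pairs


def de_scores(cond_a, cond_b, noise):
    """Per gene: fraction of noise pairs exceeded in both |M| and D."""
    scores = []
    for x, y in zip(cond_a, cond_b):
        m, d = md_stats(x, y)
        beaten = sum(1 for nm, nd in noise if abs(m) > nm and d > nd)
        scores.append(beaten / len(noise))
    return scores


# Synthetic toy counts (not real data): 200 genes with replicate jitter.
random.seed(0)
base = [random.randint(5, 500) for _ in range(200)]


def jitter(v):
    """Simulate technical variation as a random factor in [0.8, 1.25]."""
    return round(v * random.uniform(0.8, 1.25))


rep1 = [jitter(v) for v in base]    # condition A, replicate 1
rep2 = [jitter(v) for v in base]    # condition A, replicate 2
cond_b = [jitter(v) for v in base]  # condition B, mostly unchanged
cond_b[0] = rep1[0] * 8 + 1000      # gene 0 is made strongly different

noise = noise_distribution(rep1, rep2)
probs = de_scores(rep1, cond_b, noise)
print(probs[0])  # gene 0 exceeds every noise pair by construction
```

Because the reference distribution is built from the data set itself, the scores shift with the amount of noise actually present rather than with an assumed parametric model, which is the property the abstract attributes to a data-adaptive approach.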
