Department of Statistics, University of Wisconsin, Madison, WI 53706, USA.
Bioinformatics. 2013 Apr 15;29(8):1035-43. doi: 10.1093/bioinformatics/btt087. Epub 2013 Feb 21.
MOTIVATION: Messenger RNA expression is important in normal development and differentiation, as well as in manifestation of disease. RNA-seq experiments allow for the identification of differentially expressed (DE) genes and their corresponding isoforms on a genome-wide scale. However, statistical methods are required to ensure that accurate identifications are made. A number of methods exist for identifying DE genes, but far fewer are available for identifying DE isoforms. When isoform DE is of interest, investigators often apply gene-level (count-based) methods directly to estimates of isoform counts. Doing so is not recommended. In short, estimating isoform expression is relatively straightforward for some groups of isoforms, but more challenging for others. This results in estimation uncertainty that varies across isoform groups. Count-based methods were not designed to accommodate this varying uncertainty, and consequently, application of them for isoform inference results in reduced power for some classes of isoforms and increased false discoveries for others.
RESULTS: Taking advantage of the merits of empirical Bayesian methods, we have developed EBSeq for identifying DE isoforms in an RNA-seq experiment comparing two or more biological conditions. Results demonstrate substantially improved power and performance of EBSeq for identifying DE isoforms. EBSeq also proves to be a robust approach for identifying DE genes.
AVAILABILITY AND IMPLEMENTATION: An R package containing examples and sample datasets is available at http://www.biostat.wisc.edu/kendzior/EBSEQ/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.