Cancer Program, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, NSW 2010, Australia.
Bioinformatics. 2010 Jan 1;26(1):139-40. doi: 10.1093/bioinformatics/btp616. Epub 2009 Nov 11.
It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data.
The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).
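As a rough illustration of the two ideas in the summary, the sketch below assumes the overdispersed Poisson model is the negative binomial (gamma-Poisson) form, where a count with mean mu and dispersion phi has variance mu + phi * mu^2. The function names `nb_variance` and `moderate_dispersions`, and the simple weighted average used for shrinkage, are hypothetical and chosen for illustration only. This is not the edgeR API, and the package's actual empirical Bayes procedure is more involved.

```python
def nb_variance(mean, dispersion):
    """Variance of a count under a negative binomial (gamma-Poisson) model.

    With dispersion = 0 this reduces to the Poisson case (variance == mean);
    a positive dispersion adds a quadratic overdispersion term.
    """
    return mean + dispersion * mean ** 2


def moderate_dispersions(tagwise, common, prior_weight):
    """Shrink per-transcript dispersions toward a shared value.

    Hypothetical weighted average, for illustration only: a larger
    prior_weight pulls each estimate closer to the common dispersion.
    """
    return [(d + prior_weight * common) / (1.0 + prior_weight) for d in tagwise]


if __name__ == "__main__":
    # Poisson case versus an overdispersed case at the same mean.
    print(nb_variance(100.0, 0.0))
    print(nb_variance(100.0, 0.1))

    # Noisy per-transcript estimates are pulled toward the common value.
    print(moderate_dispersions([0.0, 0.4], common=0.2, prior_weight=1.0))
```

With only a few replicates, per-transcript dispersion estimates are noisy. Borrowing strength from the full set of transcripts in this way is the intuition behind moderating them.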