Probability fold change: a robust computational approach for identifying differentially expressed gene lists.

Authors

Deng Xutao, Xu Jun, Hui James, Wang Charles

Affiliations

Transcriptional Genomics Core, Cedars-Sinai Medical Center, David Geffen School of Medicine at UCLA, Los Angeles, CA 90048, USA.

Publication information

Comput Methods Programs Biomed. 2009 Feb;93(2):124-39. doi: 10.1016/j.cmpb.2008.07.013. Epub 2008 Oct 7.

DOI:10.1016/j.cmpb.2008.07.013

PMID:18842321

Abstract

Identifying genes that are differentially expressed under different experimental conditions is a fundamental task in microarray studies. However, different ranking methods generate very different gene lists, which can profoundly affect follow-up analyses and biological interpretation. Developing improved ranking methods is therefore critical in microarray data analysis. We developed a new algorithm, the probabilistic fold change (PFC), which ranks genes based on a confidence interval estimate of fold change. We performed extensive testing using multiple benchmark data sources, including the MicroArray Quality Control (MAQC) data sets. We corroborated our observations on the MAQC data sets using qRT-PCR data sets and Latin square spike-in data sets. Along with PFC, we tested six other popular ranking algorithms: Mean Fold Change (FC), SAM, t-statistic (T), Bayesian-t (BAYT), Intensity-Conditional Fold Change (CFC), and Rank Product (RP). PFC achieved reproducibility and accuracy that were consistently among the best of the seven ranking algorithms, whereas the other ranking algorithms showed weaknesses in some cases. Contrary to common belief, our results demonstrated that statistical accuracy does not translate to biological reproducibility, and therefore both quality aspects need to be evaluated.
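The abstract does not spell out how PFC computes its confidence interval, so the following is only an illustrative sketch of the general idea of ranking genes by an interval estimate of fold change rather than by the point estimate. It is not the published PFC algorithm. The normal-approximation interval, the function names `conservative_log_fc` and `rank_genes`, and the toy data are all assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def conservative_log_fc(a, b, confidence=0.95):
    """Return the confidence-interval bound on the log2 fold change closest to zero.

    a, b: log2 expression values for one gene in two conditions
    (at least two replicates each). Returns 0.0 when the interval contains zero.
    Assumption: a normal-approximation interval, not the interval used by PFC.
    """
    diff = mean(b) - mean(a)                               # point estimate of log2 fold change
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))  # standard error of the difference
    z = NormalDist().inv_cdf(0.5 + confidence / 2)          # two-sided critical value
    shrunk = max(0.0, abs(diff) - z * se)                   # pull the estimate toward zero
    return shrunk if diff >= 0 else -shrunk


def rank_genes(data, confidence=0.95):
    """Rank gene names by the magnitude of their conservative log2 fold change."""
    scores = {g: conservative_log_fc(a, b, confidence) for g, (a, b) in data.items()}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


# Hypothetical toy data: gene -> (condition A replicates, condition B replicates)
toy = {
    "stable_big": ([5.0, 5.1, 4.9], [8.0, 8.1, 7.9]),
    "noisy_big": ([5.0, 9.0, 1.0], [9.0, 13.0, 5.0]),
    "unchanged": ([6.0, 6.1, 5.9], [6.0, 6.1, 5.9]),
}
print(rank_genes(toy))
```

In this toy data, a plain mean fold change ranking would place `noisy_big` first because its mean difference is larger, whereas the interval-based score places `stable_big` first because the replicates of `noisy_big` are too variable for its interval to exclude zero.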

Similar articles

Detecting differentially expressed genes by relative entropy.

J Theor Biol. 2005 Jun 7;234(3):395-402. doi: 10.1016/j.jtbi.2004.11.039. Epub 2005 Jan 24.

Statistical analysis of microarray data: a Bayesian approach.

Biostatistics. 2003 Oct;4(4):597-620. doi: 10.1093/biostatistics/4.4.597.

Bayesian ranking and selection methods using hierarchical mixture models in microarray studies.

Biostatistics. 2010 Apr;11(2):281-9. doi: 10.1093/biostatistics/kxp047. Epub 2009 Nov 27.

A new outlier removal approach for cDNA microarray normalization.

Biotechniques. 2009 Aug;47(2):691-2, 694-700. doi: 10.2144/000113195.

A new method for class prediction based on signed-rank algorithms applied to Affymetrix microarray experiments.

BMC Bioinformatics. 2008 Jan 11;9:16. doi: 10.1186/1471-2105-9-16.

Considerations when using the significance analysis of microarrays (SAM) algorithm.

BMC Bioinformatics. 2005 May 29;6:129. doi: 10.1186/1471-2105-6-129.

Bayesian classification and non-Bayesian label estimation via EM algorithm to identify differentially expressed genes: a comparative study.

Biom J. 2008 Oct;50(5):824-36. doi: 10.1002/bimj.200710468.

A Laplace mixture model for identification of differential expression in microarray experiments.

Biostatistics. 2006 Oct;7(4):630-41. doi: 10.1093/biostatistics/kxj032. Epub 2006 Mar 24.

Selection of differentially expressed genes in microarray data analysis.

Pharmacogenomics J. 2007 Jun;7(3):212-20. doi: 10.1038/sj.tpj.6500412. Epub 2006 Aug 29.

Cited by

Eight potential biomarkers for distinguishing between lung adenocarcinoma and squamous cell carcinoma.

Oncotarget. 2017 May 3;8(42):71759-71771. doi: 10.18632/oncotarget.17606. eCollection 2017 Sep 22.

Probabilistic strain optimization under constraint uncertainty.

BMC Syst Biol. 2013 Mar 29;7:29. doi: 10.1186/1752-0509-7-29.

CDS: a fold-change based statistical test for concomitant identification of distinctness and similarity in gene expression analysis.

Genomics Proteomics Bioinformatics. 2012 Jun;10(3):127-35. doi: 10.1016/j.gpb.2012.06.002. Epub 2012 Jun 25.
