A close examination of double filtering with fold change and T test in microarray analysis.

Affiliation

Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA.

Publication information

BMC Bioinformatics. 2009 Dec 8;10:402. doi: 10.1186/1471-2105-10-402.

DOI:10.1186/1471-2105-10-402

PMID:19995439

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2801685/

Abstract

BACKGROUND

Many researchers identify differentially expressed genes with a double filtering procedure that combines a fold change cutoff with a t test, in the hope that the double filtering will provide extra confidence in the results. Because it is simple, the double filtering procedure has remained popular with applied researchers despite the development of more sophisticated methods.
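The double filtering procedure described above can be sketched in a few lines. This is a minimal illustration and not code from the paper: the function names, the toy log2 expression values, and both cutoffs are assumptions. The t cutoff of 2.776 is the two-sided 5% critical value for 4 degrees of freedom, which matches three replicates per group, and it stands in for the usual p-value threshold.

```python
from math import sqrt
from statistics import mean, variance


def log_fold_change(x, y):
    """Difference of group means; values are assumed to be on the log2 scale."""
    return mean(y) - mean(x)


def t_statistic(x, y):
    """Two-sample t statistic with a pooled, gene-specific variance."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(y) - mean(x)) / sqrt(pooled * (1 / nx + 1 / ny))


def double_filter(genes, min_abs_lfc=1.0, min_abs_t=2.776):
    """Keep only the genes that pass BOTH the fold change and the t cutoff."""
    selected = []
    for name, (x, y) in genes.items():
        passes_fc = abs(log_fold_change(x, y)) >= min_abs_lfc
        passes_t = abs(t_statistic(x, y)) >= min_abs_t
        if passes_fc and passes_t:
            selected.append(name)
    return selected


# Hypothetical toy data: (control replicates, treatment replicates) per gene.
genes = {
    "gene_a": ([5.0, 5.2, 4.8], [7.0, 7.3, 6.7]),      # large shift, small variance
    "gene_b": ([5.0, 5.01, 4.99], [5.3, 5.31, 5.29]),  # tiny shift, tiny variance
    "gene_c": ([4.0, 6.0, 5.0], [7.5, 5.5, 6.5]),      # large shift, large variance
}

print(double_filter(genes))
```

In this toy data only gene_a survives both filters: gene_b is rejected by the fold change cutoff even though its t statistic is very large, and gene_c is rejected by the t cutoff even though its fold change is large.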

RESULTS

This paper provides, for the first time to our knowledge, theoretical insight into the drawback of the double filtering procedure. We show that fold change assumes that all genes share a common variance, whereas the t statistic assumes gene-specific variances, so the two statistics rest on contradictory assumptions. Under the assumption that gene variances arise from a mixture of a common variance and gene-specific variances, we develop the theoretically most powerful likelihood ratio test statistic. We further demonstrate that posterior inference based on a Bayesian mixture model and the widely used significance analysis of microarrays (SAM) statistic are better approximations to the likelihood ratio test than the double filtering procedure.
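The contrast between the two variance assumptions can be made concrete with a SAM-type statistic of the form d = (mean difference) / (s + s0), where s is the gene-specific standard error and s0 is a constant added to every gene. The sketch below is illustrative only: it is not the paper's likelihood ratio test or its Bayesian mixture model, the toy data are hypothetical, and using the median standard error as s0 is an assumption made for the example (SAM itself selects s0 with a data-driven criterion). Setting s0 = 0 recovers the t statistic, while a very large s0 makes the ranking follow fold change.

```python
from math import sqrt
from statistics import mean, median, variance


def mean_diff_and_se(x, y):
    """Return the mean difference and its pooled standard error for one gene."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return mean(y) - mean(x), sqrt(pooled * (1 / nx + 1 / ny))


def shrunken_statistic(x, y, s0):
    """SAM-type statistic: the gene-specific error is offset by a constant s0."""
    diff, se = mean_diff_and_se(x, y)
    return diff / (se + s0)


def rank_genes(genes, s0):
    """Order gene names from largest to smallest absolute statistic."""
    return sorted(
        genes,
        key=lambda name: abs(shrunken_statistic(*genes[name], s0)),
        reverse=True,
    )


# Hypothetical toy data on the log2 scale: (control, treatment) replicates.
genes = {
    "gene_a": ([5.0, 5.2, 4.8], [7.0, 7.3, 6.7]),      # diff 2.0, small variance
    "gene_b": ([5.0, 5.01, 4.99], [5.3, 5.31, 5.29]),  # diff 0.3, tiny variance
    "gene_c": ([4.0, 6.0, 5.0], [8.5, 6.5, 7.5]),      # diff 2.5, large variance
}

# Assumed choice of s0 for illustration: the median gene-specific standard error.
s0_median = median(mean_diff_and_se(x, y)[1] for x, y in genes.values())

print("t statistic (s0 = 0):   ", rank_genes(genes, 0.0))
print("shrunken (s0 = median): ", rank_genes(genes, s0_median))
print("fold-change-like (big): ", rank_genes(genes, 100.0))
```

With these toy values the three settings produce three different rankings: the t statistic favours gene_b because of its tiny variance, the fold-change-like setting favours gene_c because of its large mean difference, and the intermediate s0 puts gene_a first.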

CONCLUSION

We demonstrate through hypothesis testing theory, simulation studies, and real data examples that well-constructed shrinkage testing methods, which can be united under the mixture gene variance assumption, can considerably outperform the double filtering procedure.
