Comparison of methods for identifying differentially expressed genes across multiple conditions from microarray data.

Authors

Tan Yuande, Liu Yin

Publication information

Bioinformation. 2011;7(8):400-4. doi: 10.6026/97320630007400. Epub 2011 Dec 21.

DOI:10.6026/97320630007400
PMID:22347782
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3280440/
Abstract

Identifying genes that are differentially expressed across multiple conditions has become an important statistical problem in the analysis of large-scale microarray data, and many statistical methods have been developed to address it. An extensive comparison of these methods is therefore important for helping experimental scientists choose a valid method for their data analysis. In this study, we conducted simulation studies to compare six statistical methods for identifying differentially expressed genes: the Bonferroni (B-) procedure, the Benjamini and Hochberg (BH-) procedure, the Local false discovery rate (Localfdr) method, the Optimal Discovery Procedure (ODP), the Ranking Analysis of F-statistics (RAF), and the Significance Analysis of Microarrays (SAM). We demonstrated that the strength of the treatment effect, the sample size, the proportion of differentially expressed genes, and the variance of gene expression significantly affect the performance of the different methods. The simulation results show that ODP exhibits extremely high power in identifying differentially expressed genes but significantly underestimates the False Discovery Rate (FDR) in all data scenarios. SAM performs poorly when the sample size is small but is among the best-performing methods when the sample size is large. The B-procedure is stringent and thus has low power in all data scenarios. Localfdr and RAF show statistical behavior comparable to that of the BH-procedure, with favorable power and conservative FDR estimation. RAF performs best when the proportion of differentially expressed genes is small and the treatment effect is weak, but Localfdr is better than RAF when the proportion of differentially expressed genes is large.
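
As an illustration of why the B-procedure is described as stringent relative to the BH-procedure, the following is a minimal sketch of the two textbook procedures applied to a list of per-gene p-values. It is not the authors' simulation code: the function names, the `alpha` level of 0.05, and the example p-values are assumptions chosen for illustration only, and the other four methods (Localfdr, ODP, RAF, SAM) are not covered.

```python
def bonferroni(pvalues, alpha=0.05):
    """Flag gene i when p_i <= alpha / m (controls the family-wise error rate)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up BH procedure: find the largest rank k with p_(k) <= k * alpha / m
    and flag the k genes with the smallest p-values (controls the FDR)."""
    m = len(pvalues)
    # Indices of the genes, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Hypothetical p-values for ten genes (illustrative numbers, not data from the paper).
pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

n_bonferroni = sum(bonferroni(pvalues))
n_bh = sum(benjamini_hochberg(pvalues))
print("Bonferroni calls:", n_bonferroni)
print("BH calls:", n_bh)
```

On these ten illustrative p-values, the Bonferroni threshold is 0.05 / 10 = 0.005, so only the first gene is called, whereas BH calls the first two. The gap reflects the different error rates being controlled and is consistent with the lower power reported for the B-procedure.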
