Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity.

Author information

Kadota Koji, Nakai Yuji, Shimizu Kentaro

Affiliations

Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.

Publication information

Algorithms Mol Biol. 2009 Apr 22;4:7. doi: 10.1186/1748-7188-4-7.

DOI:10.1186/1748-7188-4-7

PMID:19386098

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2679019/

Abstract

BACKGROUND

To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility.

RESULTS

We compared eight conventional gene ranking methods, namely weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT), in combination with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets were evaluated on the basis of the area under the receiver operating characteristic curve (AUC), used as a combined measure of sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: the percentages of overlapping genes (POGs) across different sites were higher overall for the FC-based methods than for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods, irrespective of the choice of preprocessing algorithm.
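The two evaluation measures named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`auc_from_scores`, `top_genes`, `percentage_of_overlapping_genes`) are hypothetical, the AUC is computed with the standard rank-based (Mann-Whitney) formulation, and the POG is assumed here to be the plain overlap between two equally sized top-n lists, without the direction-of-change check that some POG definitions add.

```python
from typing import Dict, Sequence, Set


def auc_from_scores(scores: Sequence[float], is_true_deg: Sequence[bool]) -> float:
    """AUC as the probability that a randomly chosen true DEG scores higher
    than a randomly chosen non-DEG (ties count as one half)."""
    pos = [s for s, t in zip(scores, is_true_deg) if t]
    neg = [s for s, t in zip(scores, is_true_deg) if not t]
    if not pos or not neg:
        raise ValueError("need at least one true DEG and one non-DEG")
    wins = 0.0
    # Quadratic pairwise comparison: fine for a sketch, slow for whole arrays.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_genes(scores: Dict[str, float], n: int) -> Set[str]:
    """Return the n genes with the largest absolute ranking statistic."""
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return set(ranked[:n])


def percentage_of_overlapping_genes(list_a: Set[str], list_b: Set[str]) -> float:
    """Assumed POG: percentage of genes shared by two equally sized lists."""
    if not list_a or len(list_a) != len(list_b):
        raise ValueError("gene lists must be non-empty and of equal size")
    return 100.0 * len(list_a & list_b) / len(list_a)
```

In this sketch a ranking method would be scored once per dataset with `auc_from_scores` (given known true DEGs), while reproducibility would be assessed by ranking the same comparison at two sites and passing the two `top_genes` sets to `percentage_of_overlapping_genes`.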

CONCLUSION

Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD.
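As a rough illustration of how a fold-change-based statistic such as WAD differs from a plain average difference, the sketch below weights each gene's average log-scale difference by its relative average expression. The abstract does not give the formula, so the weight used here (average signal rescaled to the range 0 to 1 across genes) is an assumption based on the description in the 2008 WAD paper (Algorithms Mol Biol. 2008;3:8), and the function name `wad_scores` and the input layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def wad_scores(group_a: Dict[str, List[float]],
               group_b: Dict[str, List[float]]) -> Dict[str, float]:
    """Assumed WAD-style statistic on log-scale expression values.

    group_a / group_b map each gene ID to its replicate values in one class.
    score = AD * w, where AD is the difference of class means and
    w = (avg - min_avg) / (max_avg - min_avg) over all genes.
    """
    genes = list(group_a)
    mean_a = {g: mean(group_a[g]) for g in genes}
    mean_b = {g: mean(group_b[g]) for g in genes}
    avg = {g: (mean_a[g] + mean_b[g]) / 2.0 for g in genes}
    lowest, highest = min(avg.values()), max(avg.values())
    span = highest - lowest
    scores = {}
    for g in genes:
        ad = mean_b[g] - mean_a[g]  # average difference (log fold change)
        # Weight is 0 for the weakest gene and 1 for the strongest one;
        # fall back to 1.0 if all genes share the same average signal.
        w = (avg[g] - lowest) / span if span > 0 else 1.0
        scores[g] = ad * w
    return scores
```

With this weighting, two genes with the same average difference are ranked differently when one is expressed at a higher overall level; genes would then be ranked by the absolute value of the score.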

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e10c/2679019/7496760c2995/1748-7188-4-7-1.jpg

Similar articles

A weighted average difference method for detecting differentially expressed genes from microarray data.

Algorithms Mol Biol. 2008 Jun 26;3:8. doi: 10.1186/1748-7188-3-8.

Evaluating methods for ranking differentially expressed genes applied to microArray quality control data.

BMC Bioinformatics. 2011 Jun 6;12:227. doi: 10.1186/1471-2105-12-227.

Arrow plot: a new graphical tool for selecting up and down regulated genes and genes differentially expressed on sample subgroups.

BMC Bioinformatics. 2012 Jun 26;13:147. doi: 10.1186/1471-2105-13-147.

The balance of reproducibility, sensitivity, and specificity of lists of differentially expressed genes in microarray studies.

BMC Bioinformatics. 2008 Aug 12;9 Suppl 9(Suppl 9):S10. doi: 10.1186/1471-2105-9-S9-S10.

Investigation of reproducibility of differentially expressed genes in DNA microarrays through statistical simulation.

BMC Proc. 2009 Mar 10;3 Suppl 2(Suppl 2):S4. doi: 10.1186/1753-6561-3-s2-s4.

Probability fold change: a robust computational approach for identifying differentially expressed gene lists.

Comput Methods Programs Biomed. 2009 Feb;93(2):124-39. doi: 10.1016/j.cmpb.2008.07.013. Epub 2008 Oct 7.

Investigating the concordance of Gene Ontology terms reveals the intra- and inter-platform reproducibility of enrichment analysis.

BMC Bioinformatics. 2013 Apr 29;14:143. doi: 10.1186/1471-2105-14-143.

A Robust Approach for Identification of Cancer Biomarkers and Candidate Drugs.

Medicina (Kaunas). 2019 Jun 11;55(6):269. doi: 10.3390/medicina55060269.

Probe set filtering increases correlation between Affymetrix GeneChip and qRT-PCR expression measurements.

BMC Bioinformatics. 2010 Feb 24;11:104. doi: 10.1186/1471-2105-11-104.

Cited by

Comparative Coexpression Analysis of Indole Synthase and Tryptophan Synthase A Reveals the Independent Production of Auxin via the Cytosolic Free Indole.

Plants (Basel). 2023 Apr 18;12(8):1687. doi: 10.3390/plants12081687.

Differential expression analysis using a model-based gene clustering algorithm for RNA-seq data.

BMC Bioinformatics. 2021 Oct 20;22(1):511. doi: 10.1186/s12859-021-04438-4.

Transcriptomic Changes Associated with Loss of Cell Viability Induced by Oxysterol Treatment of a Retinal Photoreceptor-Derived Cell Line: An In Vitro Model of Smith-Lemli-Opitz Syndrome.

Int J Mol Sci. 2021 Feb 26;22(5):2339. doi: 10.3390/ijms22052339.

Modified Significance Analysis of Microarrays in Heterogeneous Diseases.

J Pers Med. 2021 Jan 20;11(2):62. doi: 10.3390/jpm11020062.

Enhancing reproducibility of gene expression analysis with known protein functional relationships: The concept of well-associated protein.

PLoS Comput Biol. 2020 Feb 14;16(2):e1007684. doi: 10.1371/journal.pcbi.1007684. eCollection 2020 Feb.

The Gα12/13-coupled receptor LPA4 limits proper adipose tissue expansion and remodeling in diet-induced obesity.

JCI Insight. 2018 Dec 20;3(24):97293. doi: 10.1172/jci.insight.97293.

Silhouette Scores for Arbitrary Defined Groups in Gene Expression Data and Insights into Differential Expression Results.

Biol Proced Online. 2018 Mar 1;20:5. doi: 10.1186/s12575-018-0067-8. eCollection 2018.

ATTED-II in 2018: A Plant Coexpression Database Based on Investigation of the Statistical Property of the Mutual Rank Index.

Plant Cell Physiol. 2018 Jan 1;59(1):e3. doi: 10.1093/pcp/pcx191.

Concordance analysis of microarray studies identifies representative gene expression changes in Parkinson's disease: a comparison of 33 human and animal studies.

BMC Neurol. 2017 Mar 23;17(1):58. doi: 10.1186/s12883-017-0838-x.

Identification of BCL11B as a regulator of adipogenesis.

Sci Rep. 2016 Sep 2;6:32750. doi: 10.1038/srep32750.
