Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity.

Author information

Kadota Koji, Nakai Yuji, Shimizu Kentaro

Affiliation

Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.

Publication information

Algorithms Mol Biol. 2009 Apr 22;4:7. doi: 10.1186/1748-7188-4-7.

Abstract

BACKGROUND

To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility.

RESULTS

We compared eight conventional gene ranking methods, namely weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT), in combination with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets were evaluated on the basis of the area under the receiver operating characteristic curve (AUC), which serves as a measure of both sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: the percentages of overlapping genes (POGs) across different sites were higher overall for the FC-based methods than for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods, irrespective of the choice of preprocessing algorithm.
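
The two evaluation measures named above can be illustrated with a short Python sketch. This is not the authors' code: the function names and toy inputs are hypothetical, AUC is computed through the standard rank-sum (Mann-Whitney) identity, and POG is assumed here to be the share of genes common to two top-n lists, ignoring the direction of change.

```python
def auc_from_scores(scores, is_true_deg):
    """AUC for a gene ranking, via the rank-sum (Mann-Whitney) identity.

    scores      : one ranking score per gene (larger = more likely a DEG)
    is_true_deg : one boolean per gene marking the known true DEGs
    Tied scores receive the average of the ranks they span.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of this tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for flag in is_true_deg if flag)
    n_neg = len(scores) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true DEG and one non-DEG")
    pos_rank_sum = sum(r for r, flag in zip(ranks, is_true_deg) if flag)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def pog(ranked_a, ranked_b, top_n):
    """Percentage of overlapping genes between two top-n lists.

    ranked_a, ranked_b : gene IDs ordered from most to least significant,
                         e.g. the same comparison run at two test sites.
    Simplifying assumption: direction of change is not checked.
    """
    if top_n <= 0 or top_n > min(len(ranked_a), len(ranked_b)):
        raise ValueError("top_n must be between 1 and the list length")
    shared = set(ranked_a[:top_n]) & set(ranked_b[:top_n])
    return 100.0 * len(shared) / top_n


if __name__ == "__main__":
    # Toy example (made-up numbers): four genes, two of them true DEGs.
    print(auc_from_scores([0.9, 0.3, 0.8, 0.1], [True, True, False, False]))
    # Toy rankings from two hypothetical sites.
    print(pog(["g1", "g2", "g3", "g4"], ["g2", "g5", "g1", "g3"], top_n=2))
```

An AUC of 1.0 means every true DEG is ranked above every non-DEG, while a POG of 100 means the two sites report identical top-n lists.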

CONCLUSION

Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD.
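
For readers unfamiliar with the recommended WAD statistic, the following Python sketch shows one way it could be computed. The abstract itself does not give the formula. The sketch assumes the definition from the authors' earlier WAD publication as commonly described: AD is the difference between the two group means on a log scale, and WAD multiplies AD by a weight equal to the gene's average log expression rescaled to the range 0 to 1 across all genes. All function names and the toy data are hypothetical.

```python
def group_means(expr):
    """Mean log-scale expression per gene; expr is a list of per-gene lists."""
    return [sum(values) / len(values) for values in expr]


def wad_scores(expr_a, expr_b):
    """WAD per gene under the assumed definition: AD_i * w_i.

    expr_a, expr_b : log-scale expression values for conditions A and B,
                     one inner list of replicates per gene.
    AD_i = mean_B - mean_A
    w_i  = (avg_i - min(avg)) / (max(avg) - min(avg)),
           where avg_i = (mean_A + mean_B) / 2
    """
    mean_a = group_means(expr_a)
    mean_b = group_means(expr_b)
    ad = [b - a for a, b in zip(mean_a, mean_b)]
    avg = [(a + b) / 2 for a, b in zip(mean_a, mean_b)]
    lo, hi = min(avg), max(avg)
    if hi == lo:
        raise ValueError("all genes have the same average expression")
    weights = [(x - lo) / (hi - lo) for x in avg]
    return [d * w for d, w in zip(ad, weights)]


def rank_genes(gene_ids, scores):
    """Order genes by decreasing absolute score (up- and down-regulated)."""
    pairs = sorted(zip(gene_ids, scores), key=lambda p: abs(p[1]), reverse=True)
    return [gene for gene, _ in pairs]


if __name__ == "__main__":
    # Toy data (made-up): g1 and g2 both change by 1 log unit, but g2 is
    # expressed at a much higher level, so WAD ranks it first.
    genes = ["g1", "g2", "g3"]
    cond_a = [[2.0, 2.0], [10.0, 10.0], [6.0, 6.0]]
    cond_b = [[3.0, 3.0], [11.0, 11.0], [6.0, 6.0]]
    scores = wad_scores(cond_a, cond_b)
    print(scores)
    print(rank_genes(genes, scores))
```

The weight is what separates WAD from plain AD under this definition: of two genes with the same log difference, the one with the stronger overall signal ranks higher.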

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e10c/2679019/7496760c2995/1748-7188-4-7-1.jpg
