Robust Significance Analysis of Microarrays by Minimum β-Divergence Method.

Author information

Shahjaman Md, Kumar Nishith, Mollah Md Manir Hossain, Ahmed Md Shakil, Ara Begum Anjuman, Shahinul Islam S M, Mollah Md Nurul Haque

Affiliations

Bioinformatics Lab, Department of Statistics, University of Rajshahi, Rajshahi 6205, Bangladesh.

Department of Statistics, Begum Rokeya University, Rangpur, Rangpur 5400, Bangladesh.

Publication information

Biomed Res Int. 2017;2017:5310198. doi: 10.1155/2017/5310198. Epub 2017 Jul 27.

Abstract

Identifying differentially expressed (DE) genes across two or more conditions is an important step toward discovering a small number of biomarker genes. Significance Analysis of Microarrays (SAM) is a popular statistical approach for identifying DE genes in both small- and large-sample cases. However, it is sensitive to outlying gene expressions and has low power in the presence of outliers. In this paper, we therefore attempt to robustify the SAM approach by using minimum β-divergence estimators of the parameters in place of the maximum likelihood estimators. We compare the performance of the proposed method with that of other popular statistical methods (ANOVA, SAM, LIMMA, KW, EBarrays, GaGa, and BRIDGE) on both simulated and real gene expression datasets. In the absence of outliers, all methods perform well and almost equally in the large-sample cases, whereas in the small-sample cases only three methods (SAM, LIMMA, and the proposed method) perform almost equally and better than the others with two or more conditions. In the presence of outliers, however, only the proposed method performs better, on average, than the others for both small- and large-sample cases with each condition.
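
To make the idea of replacing maximum likelihood estimators with minimum β-divergence estimators more concrete, here is a minimal sketch for a single gene in a single condition. The abstract does not give the estimating equations, so the iteratively reweighted updates below (the form commonly given for β-divergence estimation of a Gaussian mean and variance) are an assumption, as are the function name `min_beta_gaussian`, the median/MAD starting values, and the tuning value `beta=0.2`. It should not be read as the authors' implementation.

```python
import math
from statistics import median


def min_beta_gaussian(x, beta=0.2, max_iter=200, tol=1e-10):
    """Illustrative beta-weighted estimates of a Gaussian mean and variance.

    Hypothetical helper, not taken from the paper. Each observation gets a
    weight exp(-beta/2 * (x - mu)^2 / sigma2), so points far from the bulk
    of the data contribute almost nothing to the estimates.
    """
    x = [float(v) for v in x]

    # Assumed robust starting values: median and scaled MAD.
    mu = median(x)
    mad = median([abs(v - mu) for v in x])
    sigma2 = (1.4826 * mad) ** 2 or 1.0  # fall back to 1.0 if MAD is zero

    for _ in range(max_iter):
        # beta-weights under the current parameter values.
        w = [math.exp(-0.5 * beta * (v - mu) ** 2 / sigma2) for v in x]
        sw = sum(w)

        # Weighted mean, then weighted variance rescaled by (1 + beta).
        new_mu = sum(wi * v for wi, v in zip(w, x)) / sw
        new_sigma2 = (1.0 + beta) * sum(
            wi * (v - new_mu) ** 2 for wi, v in zip(w, x)
        ) / sw

        converged = abs(new_mu - mu) < tol and abs(new_sigma2 - sigma2) < tol
        mu, sigma2 = new_mu, new_sigma2
        if converged:
            break

    # Final weights, recomputed at the returned estimates.
    w = [math.exp(-0.5 * beta * (v - mu) ** 2 / sigma2) for v in x]
    return mu, sigma2, w


# Twenty evenly spaced "clean" expression values around 5, plus two outliers.
clean = [4.0 + 2.0 * i / 19 for i in range(20)]
data = clean + [50.0, 50.0]
mu_hat, sigma2_hat, weights = min_beta_gaussian(data)
```

With `beta=0` every weight equals 1 and the updates reduce to the ordinary sample mean and (maximum likelihood) variance, which is one way to see how the weighting departs from the classical estimators. In this toy input the two values at 50 receive weights close to zero, so they barely move the estimated mean, whereas the plain sample mean is pulled well away from 5.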

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6816/5551475/811a7a5d35a3/BMRI2017-5310198.001.jpg
