Khodakarim Soheila, Tabatabaei Seyyed Mohammad, AlaviMajd Hamid
Faculty of Public Health, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
Gene. 2014 Nov 15;552(1):18-23. doi: 10.1016/j.gene.2014.09.007. Epub 2014 Sep 4.
Gene Set Analysis (GSA) identifies gene sets that are differentially expressed across phenotypes. Published results in this field are inconsistent, and there is no consensus on the best method. This paper introduces two new GSA methods and compares them with existing ones.
Two methods based on multivariate nonparametric techniques, MMGSA and MRGSA, are presented. Their ability to detect differential gene expression between phenotypes was compared with that of five existing GSA methods (Hotelling's T², Globaltest, Abs_Cat, Med_Cat and Rs_Cat) using simulated and real microarray datasets.
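As an illustration of one of the established comparison methods, the two-sample Hotelling's T² statistic for a single gene set can be sketched as follows. This is a minimal textbook version and not the implementation used in the study; the function names and the data layout (one row per sample, one column per gene in the set) are assumptions made for the example.

```python
from statistics import mean


def pooled_covariance(group_a, group_b):
    """Pooled sample covariance of two groups (rows = samples, columns = genes)."""
    p = len(group_a[0])
    pooled = [[0.0] * p for _ in range(p)]
    for group in (group_a, group_b):
        centre = [mean(col) for col in zip(*group)]
        for row in group:
            dev = [x - m for x, m in zip(row, centre)]
            for i in range(p):
                for j in range(p):
                    pooled[i][j] += dev[i] * dev[j]
    dof = len(group_a) + len(group_b) - 2
    return [[value / dof for value in row] for row in pooled]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular covariance: too few samples for this gene set")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def hotelling_t2(group_a, group_b):
    """Return (T2, F) for the difference in mean expression between two phenotypes.

    Under multivariate normality with equal covariances, F follows an
    F distribution with (p, n1 + n2 - p - 1) degrees of freedom.
    """
    n1, n2 = len(group_a), len(group_b)
    p = len(group_a[0])
    diff = [mean(a) - mean(b) for a, b in zip(zip(*group_a), zip(*group_b))]
    weights = solve(pooled_covariance(group_a, group_b), diff)  # S^-1 (mean_a - mean_b)
    t2 = (n1 * n2) / (n1 + n2) * sum(d * w for d, w in zip(diff, weights))
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat
```

The statistic requires an invertible pooled covariance matrix, so the sketch raises an error when the gene set contains more genes than the samples can support, a common situation with microarray data.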
On the real dataset, the power of MMGSA and MRGSA was comparable to that of Globaltest and the Tsai method. MRGSA did not perform well on the simulated data.
Globaltest was the best-performing method on both the real and the simulated datasets. On simulated data, MMGSA performed well for small gene sets. The GLS methods performed poorly on simulated data, with the exception of Med_Cat for large gene sets.