Chan Oliver Yuan Wei, Keng Bryan Ming Hsun, Ling Maurice Han Tong
Raffles Institution, Republic of Singapore.
Department of Zoology, The University of Melbourne, Australia; School of Chemical and Biomedical Engineering, Nanyang Technological University, Republic of Singapore.
Electron Physician. 2014 Feb 1;6(1):719-27. doi: 10.14661/2014.719-727. eCollection 2014 Jan-Mar.
Reference genes are assumed to be stably expressed under most circumstances. Previous studies have shown that common algorithms for identifying potential reference genes, such as NormFinder, geNorm, and BestKeeper, are not suitable for microarray-sized datasets. The aim of this study was to evaluate existing methods and to develop new methods for identifying reference genes from microarray datasets.
We evaluated the correlation between the outputs of seven published methods for identifying reference genes, including NormFinder, geNorm, and BestKeeper, using subsets of published microarray data. Based on these results, seven novel combinations of the published methods were then evaluated.
Our results showed that NormFinder's and geNorm's indices were highly correlated (R² = 0.987, P < 0.0001), which is consistent with the findings of previous studies. However, NormFinder's index correlated less strongly with BestKeeper's index (R² = 0.489, 0.01 < P < 0.05) and with the coefficient of variation (CV) (R² = 0.483, 0.01 < P < 0.05). We developed two novel methods that correlated highly with NormFinder (R² = 0.796 for both methods, P < 0.0001). In addition, the computational time required by the two novel methods scaled linearly with the size of the dataset.
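To illustrate the two quantities these results rely on, the following is a minimal Python sketch of a per-gene coefficient of variation and of the squared Pearson correlation (R²) between two lists of stability scores. It is not the OLIVER implementation, and it does not reproduce NormFinder, geNorm, or BestKeeper; the function names (`coefficient_of_variation`, `r_squared`, `rank_by_cv`) and the toy expression values are assumptions made only for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (hypothetical helper)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / m


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one list is constant")
    return (sxy * sxy) / (sxx * syy)


def rank_by_cv(expression):
    """Order gene names from lowest CV (most stable) to highest CV."""
    return sorted(expression, key=lambda g: coefficient_of_variation(expression[g]))


if __name__ == "__main__":
    # Toy data: gene name -> expression values across samples (made up).
    expression = {
        "geneA": [10.0, 12.0, 11.0, 9.0],
        "geneB": [5.0, 5.1, 4.9, 5.0],
        "geneC": [2.0, 8.0, 4.0, 6.0],
    }
    print(rank_by_cv(expression))
```

Because each gene's CV depends only on that gene's own values, a ranking of this kind makes one pass over the data, which is one simple way a method's running time can grow linearly with the number of genes.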
Our findings suggested that both of our novel methods can be used as alternatives to NormFinder, geNorm, and BestKeeper for identifying reference genes from large datasets. These methods were implemented as a tool, OLIgonucleotide Variable Expression Ranker (OLIVER), which can be downloaded from http://sourceforge.net/projects/bactome/files/OLIVER/OLIVER_1.zip.