BMC Bioinformatics. 2009 Dec 22;10:438; author reply 438. doi: 10.1186/1471-2105-10-438.
For gene expression data obtained from a time-course microarray experiment, Liu et al. developed a new algorithm for clustering genes with similar expression profiles over time. They compared the performance of their proposal with three other methods, including the order-restricted inference based methodology of Peddada et al. In this note, we point out several inaccuracies in Liu et al. and conclude that the order-restricted inference based methodology of Peddada et al. (implemented in the software ORIOGEN) does operate at the desired nominal Type I error level, an important feature of a statistical decision rule, while being computationally much faster than indicated by Liu et al.
Applied to the well-known breast cancer cell line data of Lobenhofer et al., the ORIOGEN software took only 21 minutes to run (using 100,000 bootstraps with p = 0.0025), substantially faster than the 72 hours reported by Liu et al. using Matlab. Also, on data simulated according to the model and parameters of simulation 1 (sigma2 = 1, M = 5) in [1], we found that ORIOGEN took less than 30 seconds to run, in stark contrast to Liu et al., who reported that their implementation of the same algorithm in R took 2979.29 seconds. Furthermore, for the simulation studies reported in [1], and contrary to the claims made by Liu et al., ORIOGEN always maintained the desired false positive rate. According to Figure 3 in Liu et al., their algorithm had a false positive rate ranging approximately from 0.20 to 0.70 for the scenarios that they simulated.
Our comparisons of run times indicate that the implementations of ORIOGEN's algorithm in Matlab and R by Liu et al. are inefficient compared with the publicly available Java implementation. Our results on the false positive rate of ORIOGEN suggest some error in Figure 3 of Liu et al., perhaps due to a programming error.
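The claim that a decision rule "maintains the desired false positive rate" can be checked empirically by applying the rule to data simulated under the null hypothesis and counting how often it rejects. The following is a minimal sketch of that generic check, not ORIOGEN's bootstrap procedure: the simple z-test, the function names, and the choices of 2000 null genes and alpha = 0.05 are all assumptions made for illustration. The 5 time points and unit noise variance loosely echo the simulation setting mentioned above (sigma2 = 1, M = 5), under the assumption that M denotes the number of time points.

```python
import math
import random


def z_test_pvalue(sample, sigma=1.0):
    """Two-sided p-value for H0: mean == 0, assuming a known standard deviation.

    This is a stand-in decision rule for illustration only; it is NOT the
    order-restricted bootstrap test used by ORIOGEN.
    """
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def empirical_false_positive_rate(pvalue_fn, n_genes=2000, n_points=5,
                                  alpha=0.05, seed=0):
    """Fraction of null 'genes' that the rule wrongly declares significant."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_genes):
        # Null profile: no trend over time, only N(0, 1) noise.
        profile = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        if pvalue_fn(profile) <= alpha:
            rejections += 1
    return rejections / n_genes


rate = empirical_false_positive_rate(z_test_pvalue)
print(f"empirical false positive rate: {rate:.3f}")
```

A rule that operates at its nominal level should give a rate close to alpha here, up to simulation noise; rates such as 0.20 to 0.70 at a small nominal level would point to a problem in either the rule or its implementation.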