A response to information criterion-based clustering with order-restricted candidate profiles in short time-course microarray experiments.

Publication information

BMC Bioinformatics. 2009 Dec 22;10:438; author reply 438. doi: 10.1186/1471-2105-10-438.

PMID:20028515

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2813245/

Abstract

BACKGROUND

For gene expression data obtained from a time-course microarray experiment, Liu et al. developed a new algorithm for clustering genes with similar expression profiles over time. Performance of their proposal was compared with three other methods including the order-restricted inference based methodology of Peddada et al. In this note we point out several inaccuracies in Liu et al. and conclude that the order-restricted inference based methodology of Peddada et al. (programmed in the software ORIOGEN) indeed operates at the desired nominal Type 1 error level, an important feature of a statistical decision rule, while being computationally substantially faster than indicated by Liu et al.
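To make the phrase "operates at the desired nominal Type 1 error level" concrete, the following is a minimal sketch of how an empirical false positive rate can be checked by simulation. It is not ORIOGEN's order-restricted bootstrap procedure: a generic two-group permutation test is swapped in purely for illustration, and the function names, group sizes, number of permutations, and alpha = 0.05 are all assumptions chosen for this example.

```python
import random


def permutation_p_value(x, y, n_perm, rng):
    """Two-sided Monte Carlo permutation p-value for a difference in group means."""
    n_x = len(x)
    n_y = len(y)
    observed = abs(sum(x) / n_x - sum(y) / n_y)
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        diff = abs(sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y)
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the Monte Carlo p-value from being anti-conservative.
    return (extreme + 1) / (n_perm + 1)


def empirical_false_positive_rate(n_genes=1000, n_per_group=5, n_perm=199,
                                  alpha=0.05, seed=0):
    """Fraction of null 'genes' (no true group difference) declared significant."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_genes):
        # Both groups come from the same distribution, so every rejection is a false positive.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if permutation_p_value(x, y, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_genes


if __name__ == "__main__":
    print(f"Empirical false positive rate: {empirical_false_positive_rate():.3f}")
```

A procedure that respects its nominal level should yield an empirical rate close to, or below, the chosen alpha when run on data with no true signal. Rates of 0.20 to 0.70 at a small nominal level would indicate a procedure, or an implementation of it, that does not.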

RESULTS

Application of ORIOGEN to the well-known breast cancer cell line data of Lobenhofer et al. revealed that the ORIOGEN software took only 21 minutes to run (using 100,000 bootstraps with p = 0.0025), substantially faster than the 72 hours found by Liu et al. using Matlab. Also, based on data simulated according to the model and parameters of simulation 1 (σ² = 1, M = 5) in [1], we found that ORIOGEN took less than 30 seconds to run, in stark contrast to Liu et al., who reported that their implementation of the same algorithm in R took 2979.29 seconds. Furthermore, for the simulation studies reported in [1], contrary to the claims made by Liu et al., ORIOGEN always maintained the desired false positive rate. According to Figure 3 in Liu et al., their algorithm had a false positive rate ranging approximately from 0.20 to 0.70 for the scenarios that they simulated.
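The size of the run-time gap follows from simple arithmetic on the figures quoted above. The sketch below only converts the reported times to seconds and takes their ratio. The helper name `speedup` is hypothetical, and the ratios say nothing about hardware, which the abstract does not describe.

```python
def speedup(slow_seconds, fast_seconds):
    """Ratio of the slower run time to the faster one."""
    return slow_seconds / fast_seconds


# Breast cancer cell line data: 72 hours (Matlab, Liu et al.) vs. 21 minutes (ORIOGEN).
matlab_seconds = 72 * 60 * 60
oriogen_seconds = 21 * 60
breast_cancer_speedup = speedup(matlab_seconds, oriogen_seconds)

# Simulated data: 2979.29 s (R, Liu et al.) vs. "less than 30 seconds" (ORIOGEN).
# Because 30 s is an upper bound on ORIOGEN's time, this ratio is a lower bound.
simulation_speedup_lower_bound = speedup(2979.29, 30.0)

if __name__ == "__main__":
    print(f"Breast cancer data: about {breast_cancer_speedup:.0f}x faster")
    print(f"Simulated data: more than {simulation_speedup_lower_bound:.0f}x faster")
```

On the reported numbers, the Java implementation is roughly 200 times faster on the breast cancer data and at least about 99 times faster on the simulated data.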

CONCLUSIONS

Our comparisons of run times indicate that the implementations of ORIOGEN's algorithm in Matlab and R by Liu et al. are inefficient compared to the publicly available Java implementation. Our results on the false positive rate of ORIOGEN suggest some error in Figure 3 of Liu et al., perhaps due to a programming error.
