Key Laboratory for Symbol Computation and Knowledge Engineering of National Education Ministry, College of Computer Science and Technology, Jilin University, Changchun, China.
PLoS One. 2012;7(1):e29860. doi: 10.1371/journal.pone.0029860. Epub 2012 Jan 20.
In previous work, we proposed a method for detecting differential gene expression (DGE) based on change-points in the expression profile. This non-parametric change-point method gave promising results in both simulation studies and experiments on public datasets. However, its performance was limited by low sensitivity to the right bound, and the statistical significance of the statistic had not been fully explored. To overcome the insensitivity to the right bound, we modified the original method by adding a weight function to the D(n) statistic. Simulations showed that the weighted change-point statistic performs significantly better than the original NPCPS in terms of ROC, false positive rate (FPR), and change-point estimation. The mean absolute error of the change-point estimated by the weighted method was 0.03, a reduction of more than 50% from the original 0.06, and the mean FPR was reduced by more than 55%. On microarray Dataset I, the method identified 3974 differentially expressed genes out of 5293; on microarray Dataset II, it identified 9983 out of 12576. In summary, the proposed method is an effective modification of the previous one, especially when only a small subset of cancer samples exhibits DGE.
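The idea of weighting a change-point statistic to regain sensitivity near the ends of a profile can be illustrated with a generic weighted CUSUM. This is only a sketch under stated assumptions: the abstract does not give the form of D(n) or of its weight function, so the statistic, the weight `(t(1-t))^(-gamma)`, the function name `weighted_cusum_change_point`, and the parameter `gamma` below are all hypothetical stand-ins rather than the authors' method.

```python
import numpy as np

def weighted_cusum_change_point(x, gamma=0.5):
    """Estimate a single mean-shift location with a weighted CUSUM.

    Generic textbook-style statistic, NOT the paper's D(n); the weight
    and `gamma` are illustrative assumptions.
    Returns (k_hat, stat), where k_hat is the number of samples that
    precede the estimated change.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two observations")

    k = np.arange(1, n)                    # candidate split sizes 1..n-1
    cusum = np.cumsum(x - x.mean())[:-1]   # centered partial sums S_k
    t = k / n                              # relative split position in (0, 1)

    # The weight grows toward both boundaries, compensating for the
    # small partial sums a change near either end of the profile produces.
    weight = (t * (1.0 - t)) ** (-gamma)

    stat = np.abs(cusum) * weight / np.sqrt(n)
    best = int(np.argmax(stat))
    return int(k[best]), float(stat[best])


# Toy profile: only the last 2 of 20 samples are shifted upward.
profile = [0.0] * 18 + [5.0] * 2
k_hat, stat = weighted_cusum_change_point(profile)
```

Setting `gamma=0` recovers the unweighted CUSUM, which makes it easy to compare the two variants on noisy profiles where the change lies close to the right end.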