Hirakawa Akihiro, Sato Yasunori, Sozu Takashi, Hamada Chikuma, Yoshimura Isao
Genetics Division, National Cancer Center Research Institute, Chuo-ku, Tokyo, Japan.
Cancer Inform. 2008 Jan 22;3:140-8.
Recent developments in DNA microarray technology allow us to measure the expression levels of thousands of genes simultaneously and to identify, from many candidate genes, those truly correlated with anticancer drug response (differentially expressed genes). Significance Analysis of Microarrays (SAM) is often used to estimate the false discovery rate (FDR), an index for optimizing the identification of differentially expressed genes, although the accuracy of the FDR estimated by SAM has not necessarily been confirmed. We propose a new method for estimating the FDR that assumes a mixed normal distribution for the test statistic, and we examine the performance of the proposed method and SAM using simulated data. The simulation results indicate that the accuracy of the FDR estimated by both the proposed method and SAM varied depending on the experimental conditions. We applied both methods to actual data comprising the expression levels of 12,625 genes in 10 responders and 14 non-responders to docetaxel for breast cancer. Using a cut-off value chosen to achieve an FDR below 0.01, so as to limit false-positive genes, the proposed method identified 280 differentially expressed genes correlated with docetaxel response, although 92 genes were previously thought to be correlated with docetaxel response.
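The abstract does not spell out the mixture model, so the following is only a minimal sketch of the general idea of estimating the FDR from a normal mixture fitted to per-gene test statistics. It is not the authors' procedure. It assumes a two-component model in which null genes follow N(0, 1) and differentially expressed genes follow N(0, tau^2) with tau > 1. The function names (`fit_two_group`, `estimated_fdr`), the EM starting values, and the simulated data are all hypothetical and chosen for illustration.

```python
import math
import numpy as np


def normal_pdf(z, sd):
    """Density of N(0, sd^2) evaluated at z (array or scalar)."""
    return np.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_group(z, n_iter=200):
    """EM fit of pi0 * N(0, 1) + (1 - pi0) * N(0, tau^2) to statistics z.

    Assumption for this sketch: the null component is fixed at N(0, 1) and
    only pi0 (null proportion) and tau (alternative spread) are estimated.
    """
    pi0, tau2 = 0.8, 4.0  # arbitrary starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each gene is null
        f0 = pi0 * normal_pdf(z, 1.0)
        f1 = (1.0 - pi0) * normal_pdf(z, math.sqrt(tau2))
        w = f0 / (f0 + f1)
        # M-step: update the null proportion and the alternative variance
        pi0 = float(np.mean(w))
        tau2 = float(np.sum((1.0 - w) * z ** 2) / np.sum(1.0 - w))
    return pi0, math.sqrt(tau2)


def two_sided_tail(c, sd):
    """P(|Z| > c) for Z ~ N(0, sd^2)."""
    return math.erfc(c / (sd * math.sqrt(2.0)))


def estimated_fdr(c, pi0, tau):
    """Model-based FDR when every gene with |z| > c is called significant."""
    null_mass = pi0 * two_sided_tail(c, 1.0)
    alt_mass = (1.0 - pi0) * two_sided_tail(c, tau)
    return null_mass / (null_mass + alt_mass)


# Simulated statistics (illustrative only): 90% null genes, 10% with spread 3.
rng = np.random.default_rng(0)
n_genes = 20000
is_null = rng.random(n_genes) < 0.9
z = np.where(is_null, rng.normal(0.0, 1.0, n_genes), rng.normal(0.0, 3.0, n_genes))

pi0_hat, tau_hat = fit_two_group(z)
for cutoff in (1.0, 2.0, 3.0, 4.0):
    print(f"cutoff={cutoff:.1f}  estimated FDR={estimated_fdr(cutoff, pi0_hat, tau_hat):.4f}")
```

Under this model, raising the cut-off lowers the estimated FDR, so one could scan cut-offs and keep the smallest one whose estimated FDR falls below a target such as 0.01. Whether the estimate is accurate depends on how well the assumed mixture matches the real distribution of the statistics, which is the question the simulation study in the abstract addresses.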