Owzar Kouros, Barry William T, Jung Sin-Ho, Sohn Insuk, George Stephen L
Department of Biostatistics and Bioinformatics, and Cancer and Leukemia Group B Statistical Center, Duke University School of Medicine, 2424 Erwin Road, Durham, NC 27705, USA.
Clin Cancer Res. 2008 Oct 1;14(19):5959-66. doi: 10.1158/1078-0432.CCR-07-4532.
Many clinical studies incorporate genomic experiments to investigate potential associations between high-dimensional molecular data and clinical outcome. A critical first step in the statistical analysis of these experiments is preprocessing the molecular data. This article provides an overview of preprocessing methods, including summary algorithms and quality control metrics for microarrays. Some of the effects that preprocessing methods have on the statistical results are illustrated. The discussion is centered on a microarray experiment based on lung cancer tumor samples, with survival as the clinical outcome of interest. The procedures presented focus on the array platform used in this study; however, many of the issues are more general and apply to other instruments for genome-wide investigation. The discussion provides insight into the statistical challenges of preprocessing microarrays used in clinical studies of cancer. These challenges should not be viewed as inconsequential nuisances but rather as important issues that must be addressed so that informed conclusions can be drawn.