Norris Andrew W, Kahn C Ronald
Joslin Diabetes Center, Children's Hospital, and Harvard Medical School, Boston, MA 02215, USA.
Proc Natl Acad Sci U S A. 2006 Jan 17;103(3):649-53. doi: 10.1073/pnas.0510115103. Epub 2006 Jan 9.
Nucleotide-microarray technology, which allows the simultaneous measurement of the expression of tens of thousands of genes, has become an important tool in the study of disease. In disorders such as malignancy, gene expression often undergoes broad changes of sizable magnitude, whereas in many common multifactorial diseases, such as diabetes, obesity, and atherosclerosis, the changes in gene expression are modest. In the latter circumstance, it is challenging to distinguish the truly changing genes from the non-changing ones, especially because statistical significance must be considered in the context of multiple hypothesis testing. Here, we present a balanced probability analysis (BPA), which gives the biologist an approach to interpreting results in the context of the total number of genes truly differentially expressed and of the false discovery and false negative rates for the list of genes reaching any significance threshold. When the changes are of modest magnitude, considering the false discovery rate alone can result in poor power to detect genes that are truly differentially expressed. Concomitant analysis of the false negative rate, i.e., the rate of truly differentially expressed genes not identified, allows the two error rates to be balanced and gives more thorough insight into the data. To this end, we have developed a unique, model-based procedure for the estimation of false negative rates, which allows application of BPA to real data in which changes are modest.
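The trade-off described above can be illustrated with a small simulation. The sketch below is not the model-based estimation procedure of the paper: it assumes a toy experiment in which the truly differentially expressed genes are known, so that both error rates can be counted directly. The function names, the effect size, and the rule of picking the threshold where the two rates are closest are all illustrative assumptions. The false negative rate is taken, as in the abstract, to be the fraction of truly differentially expressed genes that are not identified.

```python
import math
import random


def simulate_p_values(n_genes=10000, n_true=500, effect=2.0, seed=0):
    """Toy data (assumption): z-scores with known truth, two-sided p-values."""
    rng = random.Random(seed)
    p_values, is_true_de = [], []
    for i in range(n_genes):
        truly_de = i < n_true  # the first n_true genes are truly changed
        z = rng.gauss(effect if truly_de else 0.0, 1.0)
        p_values.append(math.erfc(abs(z) / math.sqrt(2.0)))
        is_true_de.append(truly_de)
    return p_values, is_true_de


def error_rates(p_values, is_true_de, threshold):
    """Return (FDR, FNR) for the list of genes with p <= threshold."""
    called = [t for p, t in zip(p_values, is_true_de) if p <= threshold]
    true_pos = sum(called)
    false_pos = len(called) - true_pos
    n_true = sum(is_true_de)
    # FDR: fraction of the called list that is not truly changed.
    fdr = false_pos / len(called) if called else 0.0
    # FNR: fraction of truly changed genes missing from the called list.
    fnr = (n_true - true_pos) / n_true if n_true else 0.0
    return fdr, fnr


def balanced_threshold(p_values, is_true_de, thresholds):
    """Illustrative rule (assumption): threshold where FDR and FNR are closest."""
    def gap(t):
        fdr, fnr = error_rates(p_values, is_true_de, t)
        return abs(fdr - fnr)
    return min(thresholds, key=gap)


if __name__ == "__main__":
    p_values, truth = simulate_p_values()
    grid = [0.0001, 0.001, 0.01, 0.05, 0.1, 0.2]
    for t in grid:
        fdr, fnr = error_rates(p_values, truth, t)
        print(f"p <= {t:<7} FDR = {fdr:.3f}  FNR = {fnr:.3f}")
    print("closest balance at p <=", balanced_threshold(p_values, truth, grid))
```

With a modest simulated effect, a strict threshold keeps the false discovery rate low but leaves most truly changed genes undetected, while a lenient threshold does the reverse. In real data the truth labels are unknown, which is why a procedure for estimating the false negative rate is needed.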