Gibson Jane, Tapper William, Cox David, Zhang Weihua, Pfeufer Arne, Gieger Christian, Wichmann H-Erich, Kääb Stefan, Collins Andrew R, Meitinger Thomas, Morton Newton
Human Genetics Division, School of Medicine, University of Southampton, Southampton SO16 6YD, United Kingdom.
Proc Natl Acad Sci U S A. 2008 Feb 19;105(7):2592-7. doi: 10.1073/pnas.0711903105. Epub 2008 Feb 11.
Two case/control studies with different phenotypes, marker densities, and microarrays were examined for the most significant single markers in defined regions. They show a pronounced bias toward exaggerated significance that increases with the number of observed markers and would increase further with imputed markers. This bias is eliminated by Bonferroni adjustment, thereby allowing combination by principal component analysis with a Malecot model composite likelihood evaluated by a permutation procedure to allow for multiple dependent markers. This intermediate value identifies the only demonstrated causal locus as most significant even in the preliminary analysis and clearly recognizes the strongest candidate in the other sample. Because the three metrics (most significant single marker, composite likelihood, and their principal component) are correlated, choice of the n smallest P values by each test gives <3n regions for follow-up in the next stage. In this way, methods with different response to marker selection and density are given approximately equal weight and economically compared, without expressing an untested prejudice or sacrificing the most significant results for any of them. Large numbers of cases, controls, and markers are by themselves insufficient to control type 1 and 2 errors, and so efficient use of multiple metrics with Bonferroni adjustment promises to be valuable in identifying causal variants and optimal design simultaneously.
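The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, region labels, and P values are hypothetical, and the composite likelihood and principal component P values are taken as given inputs rather than computed. The sketch only shows (a) Bonferroni adjustment of the most significant single marker by the number of markers observed in a region, and (b) why taking the n smallest P values under each of three correlated metrics yields at most 3n, and usually fewer, regions for follow-up.

```python
def bonferroni_min_p(p_values):
    """Bonferroni-adjust the smallest P value in a region.

    Multiplies the minimum P value by the number of markers tested
    in the region and caps the result at 1.0.
    """
    m = len(p_values)
    return min(1.0, min(p_values) * m)


def top_n(scores, n):
    """Return the set of regions with the n smallest P values."""
    return set(sorted(scores, key=scores.get)[:n])


def follow_up_regions(metrics, n):
    """Union of the top-n regions chosen separately by each metric."""
    selected = set()
    for scores in metrics.values():
        selected |= top_n(scores, n)
    return selected


# Hypothetical regions: a dense region can look more significant
# before adjustment simply because more markers were observed.
dense = [0.001] + [0.5] * 99   # 100 markers, unadjusted minimum 0.001
sparse = [0.004] + [0.5] * 4   # 5 markers, unadjusted minimum 0.004
adjusted_dense = bonferroni_min_p(dense)    # about 0.1
adjusted_sparse = bonferroni_min_p(sparse)  # about 0.02

# Hypothetical per-region P values for the three metrics.
metrics = {
    "single_marker": {"A": 0.01, "B": 0.02, "C": 0.5, "D": 0.3, "E": 0.9},
    "composite_likelihood": {"A": 0.03, "B": 0.4, "C": 0.02, "D": 0.6, "E": 0.8},
    "principal_component": {"A": 0.01, "B": 0.05, "C": 0.04, "D": 0.7, "E": 0.9},
}
chosen = follow_up_regions(metrics, n=2)
```

With these made-up numbers, the dense region ranks ahead of the sparse one before adjustment and behind it afterwards, and the three metrics together select three regions rather than the maximum of six because their top choices overlap.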