Becker Tim, Schumacher Johannes, Cichon Sven, Baur Max P, Knapp Michael
Institute for Medical Biometry, Informatics and Epidemiology, University of Bonn, Bonn, Germany.
Genet Epidemiol. 2005 Dec;29(4):313-22. doi: 10.1002/gepi.20096.
Genetically complex diseases are caused by interacting environmental factors and genes. As a consequence, statistical methods that consider multiple unlinked genomic regions simultaneously are desirable. Such consideration, however, may lead to a vast number of different high-dimensional tests whose appropriate analysis poses a problem. Here, we present a method to analyze case-control studies with multiple SNP data without phase information that considers gene-gene interaction effects while correcting appropriately for multiple testing. In particular, we allow for interactions of haplotypes that belong to different unlinked regions, as haplotype analysis often proves to be more powerful than single marker analysis. In addition, we consider different marker combinations at each unlinked region. The multiple testing issue is settled via the minP approach; the P value of the "best" marker/region configuration is corrected via Monte-Carlo simulations. Thus, we do not explicitly test for a specific pre-defined interaction model, but test for the global hypothesis that none of the considered haplotype interactions shows association with the disease. We carry out a simulation study for case-control data that confirms the validity of our approach. When simulating two-locus disease models, our test proves to be more powerful than association methods that analyze each linked region separately. In addition, when one of the tested regions is not involved in the etiology of the disease, only a small amount of power is lost with interaction analysis as compared to analysis without interaction. We successfully applied our method to a real case-control data set with markers from two genes controlling a common pathway. While classical analysis failed to reach significance, we obtained a significant result even after correction for multiple testing with our proposed haplotype interaction analysis. The method described here has been implemented in FAMHAP.
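The minP correction mentioned in the abstract can be illustrated with a minimal sketch. It is not the FAMHAP implementation: it assumes each marker/region configuration has already been reduced to a binary carrier indicator per individual (a hypothetical simplification that skips haplotype estimation from unphased genotypes), and it uses a 1-df Pearson chi-square test as a stand-in for the actual test statistic. The function names `chi2_pvalue_2x2`, `min_p`, and `minp_corrected` are illustrative only.

```python
import math
import random


def chi2_pvalue_2x2(features, labels):
    """P value of a 1-df Pearson chi-square test on a 2x2 table.

    features: 0/1 carrier indicator per individual (assumed encoding).
    labels:   1 for cases, 0 for controls.
    """
    a = b = c = d = 0  # case carriers, case non-carriers, control carriers, control non-carriers
    for f, y in zip(features, labels):
        if y and f:
            a += 1
        elif y:
            b += 1
        elif f:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table carries no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def min_p(configs, labels):
    """Smallest P value over all considered configurations."""
    return min(chi2_pvalue_2x2(f, labels) for f in configs)


def minp_corrected(configs, labels, n_perm=1000, seed=0):
    """Return (observed minimum P, Monte-Carlo corrected P).

    Case/control labels are permuted; the corrected P value is the share of
    permutations whose minimum P is at least as small as the observed one.
    """
    rng = random.Random(seed)
    observed = min_p(configs, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if min_p(configs, shuffled) <= observed:
            hits += 1
    # Adding 1 to numerator and denominator keeps the estimate above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Because every permutation re-evaluates all configurations and keeps only the smallest P value, the corrected P value refers to the global hypothesis that none of the configurations is associated with disease status, which is the hypothesis the abstract describes.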