Briollais Laurent, Wang Yuanyuan, Rajendram Isaac, Onay Venus, Shi Ellen, Knight Julia, Ozcelik Hilmi
Prosserman Centre for Health Research, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, M5T 3L9, Canada.
BMC Med. 2007 Aug 7;5:22. doi: 10.1186/1741-7015-5-22.
There is growing evidence that gene-gene interactions play a widespread role in determining susceptibility to common human diseases. Investigating such interactions presents new statistical challenges for studies with relatively small sample sizes, because the number of potential interactions in the genome can be large. Breast cancer provides a useful paradigm for studying genetically complex diseases because commonly occurring single nucleotide polymorphisms (SNPs) may additively or synergistically disturb the system-wide communication of the cellular processes leading to cancer development.
We systematically examined SNP-SNP interactions among 19 SNPs from 18 key genes involved in major cancer pathways in a sample of 398 breast cancer cases and 372 controls from Ontario. We discuss the methodological issues associated with detecting SNP-SNP interactions in this dataset by applying and comparing three commonly used methods: the logistic regression model, classification and regression trees (CART), and the multifactor dimensionality reduction (MDR) method.
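To make the MDR idea concrete, the following is a minimal sketch of its core step for a single pair of SNPs, not the analysis pipeline used in the study. It assumes genotypes are coded 0/1/2 and disease status is coded 1 for cases and 0 for controls. Each two-SNP genotype cell is labelled high-risk when its case:control ratio exceeds the overall ratio, and the pair is scored by how well that labelling classifies the sample. The function names and the toy data are hypothetical, and the cross-validation that MDR normally relies on is omitted for brevity.

```python
from collections import Counter
from itertools import combinations


def mdr_pair_accuracy(genotypes, status, i, j):
    """Training accuracy of an MDR-style high/low-risk rule for SNPs i and j."""
    n_cases = sum(status)
    n_controls = len(status) - n_cases

    # Count cases and controls in each two-SNP genotype cell.
    cases, controls = Counter(), Counter()
    for row, y in zip(genotypes, status):
        cell = (row[i], row[j])
        if y == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1

    correct = 0
    for cell in set(cases) | set(controls):
        # High-risk if the cell's case:control ratio exceeds the overall
        # ratio (cross-multiplied to avoid dividing by zero).
        high_risk = cases[cell] * n_controls > controls[cell] * n_cases
        correct += cases[cell] if high_risk else controls[cell]
    return correct / len(status)


def best_pair(genotypes, status):
    """Exhaustively score every SNP pair and return the best-classifying one."""
    n_snps = len(genotypes[0])
    return max(
        combinations(range(n_snps), 2),
        key=lambda pair: mdr_pair_accuracy(genotypes, status, *pair),
    )


# Hypothetical toy data: three SNPs, where disease status depends only on
# the joint genotype at SNP 0 and SNP 1 (a pure interaction pattern).
toy_genotypes = [(a, b, c) for a in range(3) for b in range(3) for c in range(3)]
toy_status = [1 if (a == 1) != (b == 1) else 0 for a, b, c in toy_genotypes]

if __name__ == "__main__":
    pair = best_pair(toy_genotypes, toy_status)
    print(pair, mdr_pair_accuracy(toy_genotypes, toy_status, *pair))
```

With 19 SNPs an exhaustive search of this kind covers 171 two-way combinations, and the count grows quickly for higher-order interactions, which is why sample size becomes a limiting factor.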
Our analyses show evidence for several simple (two-way) and complex (multi-way) SNP-SNP interactions associated with breast cancer. For example, all three methods identified XPD-[Lys751Gln]*IL10-[G(-1082)A] as the most significant two-way interaction. CART and MDR identified the same critical SNPs participating in complex interactions. Our results suggest that using multiple statistical approaches (or an integrated approach), rather than a single methodology, could be the best strategy for elucidating complex gene interactions, which generally show very different patterns.
The strategy used here has the potential to identify complex biological relationships among breast cancer genes and processes. This will lead to the discovery of novel biological information, which will improve breast cancer risk management.