Basu Saonli, Pan Wei, Oetting William S
Division of Biostatistics, University of Minnesota, Minneapolis, USA. saonli@umn.edu
Hum Hered. 2011;71(4):234-45. doi: 10.1159/000328842. Epub 2011 Jul 6.
Studying one locus or one single nucleotide polymorphism (SNP) at a time may not be sufficient to understand complex diseases, because such diseases are unlikely to result from the effect of only one SNP. Each SNP alone may have little or no effect on disease risk, but together they may increase the risk substantially. Analyses focusing on individual SNPs ignore the possibility of interaction among SNPs. In this paper, we propose a parsimonious model to assess the joint effect of a group of SNPs in a case-control study. The model implements a data reduction strategy within a likelihood framework and uses a test to assess the statistical significance of the effect of the group of SNPs on the binary trait. The primary advantage of the proposed approach is that the dimension reduction technique produces a test statistic with substantially fewer degrees of freedom than a multiple logistic regression with only main effects of the SNPs, while the parsimonious model can still incorporate the possibility of interaction among the SNPs. Moreover, the proposed approach estimates the direction of association of each SNP with the disease and provides an estimate of the average effect of the SNPs that are positively associated with the disease, and of those that are negatively associated with it, in the given SNP set. We illustrate the proposed model on simulated and real data, and compare its performance with a few other existing approaches. In our simulation studies, the proposed approach appeared to outperform the other approaches for independent SNPs.
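The abstract above does not spell out the likelihood or the test, so the following is only a minimal sketch of the general idea of collapsing a SNP set into a few scores, not the authors' model. It assumes genotypes coded 0/1/2, estimates each SNP's direction from the sign of the case-control difference in mean allele count, sums the SNPs in each direction into one score, and compares a logistic regression on those scores with an intercept-only model. All function names (`snp_directions`, `fit_logistic`, `group_lr_test`), the direction rule, and the simulated effect sizes are assumptions made for illustration.

```python
import math
import random

def snp_directions(G, y):
    """Assumed rule: +1 if cases carry more minor alleles on average, else -1."""
    n_case = sum(y)
    n_ctrl = len(y) - n_case
    dirs = []
    for j in range(len(G[0])):
        case_mean = sum(g[j] for g, yi in zip(G, y) if yi == 1) / n_case
        ctrl_mean = sum(g[j] for g, yi in zip(G, y) if yi == 0) / n_ctrl
        dirs.append(1 if case_mean >= ctrl_mean else -1)
    return dirs

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, iters=25):
    """Newton-Raphson logistic regression; returns (coefficients, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # X' W X, the negative Hessian
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    loglik = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        loglik += yi * eta - math.log1p(math.exp(eta))
    return beta, loglik

def group_lr_test(G, y):
    """Collapse SNPs into one score per direction and compute an LR statistic."""
    dirs = snp_directions(G, y)
    groups = [s for s in (1, -1) if s in dirs]  # skip a direction with no SNPs
    X1 = [[1.0] + [float(sum(x for x, d in zip(g, dirs) if d == s)) for s in groups]
          for g in G]
    X0 = [[1.0] for _ in G]
    _, ll1 = fit_logistic(X1, y)
    _, ll0 = fit_logistic(X0, y)
    return dirs, 2.0 * (ll1 - ll0), len(groups)

# Simulated case-control data (assumed effects): two risk SNPs, two protective
# SNPs, and two null SNPs, all independent with minor allele frequency 0.3.
random.seed(0)
effects = [0.5, 0.5, -0.5, -0.5, 0.0, 0.0]
G, y = [], []
for _ in range(600):
    g = [sum(random.random() < 0.3 for _ in range(2)) for _ in effects]
    eta = sum(b * x for b, x in zip(effects, g))
    G.append(g)
    y.append(1 if random.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)

dirs, lr, df = group_lr_test(G, y)
print("estimated directions:", dirs)
print("LR statistic:", round(lr, 2), "model degrees of freedom:", df)
```

In this sketch the collapsed model spends at most 2 degrees of freedom regardless of the number of SNPs, whereas a main-effects logistic regression on the same six SNPs would spend 6. Because the directions are estimated from the same data used for the test, the statistic should not be referred naively to a chi-square distribution; one generic way to calibrate it is to permute case-control labels and re-estimate the directions within each permutation.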