Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Miaoli, Taiwan, ROC.
Genet Epidemiol. 2012 Feb;36(2):88-98. doi: 10.1002/gepi.21602.
Gene-gene interactions, which may exist even in the absence of genetic main effects, play an important role in the etiology of complex diseases. Most current statistical approaches, however, assess an interaction effect only in the presence of the genes' main effects. It would therefore be helpful to develop methods that detect both main effects and gene-gene interaction effects, whether or not main effects exist, while adjusting for confounding factors. In addition, when a disease variant is rare or the sample size is limited, asymptotic statistical properties do not apply, so approaches built on a reasonable computational framework would be practical and likely to be widely used. In this study, we developed an extended support vector machine (SVM) method and an SVM-based pedigree-based generalized multifactor dimensionality reduction (PGMDR) method to study interactions in the presence or absence of genetic main effects, with adjustment for covariates, using limited samples of families. A new test statistic is proposed for classifying affected and unaffected individuals in the SVM-based PGMDR approach to improve performance in detecting gene-gene interactions. Simulation studies under various scenarios were performed to compare the performance of the proposed and original methods. Both the proposed and original approaches were also applied to a real data example for illustration and comparison. The simulation and real data studies both show that the proposed SVM and SVM-based PGMDR methods have high prediction accuracy, consistency, and power in detecting gene-gene interactions.
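The claim that an interaction can exist without a main effect can be checked with a small worked example. The sketch below is not taken from the study: the genotype frequencies, the penetrance table, and all function names are hypothetical, chosen only to show a two-locus model in which disease risk depends on the genotype combination while each locus, viewed alone, carries no signal.

```python
from fractions import Fraction

# Hypothetical genotype frequencies for two unlinked biallelic loci, each with
# allele frequency 0.5 under Hardy-Weinberg equilibrium (0, 1, 2 copies).
GENOTYPE_FREQ = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

# Hypothetical penetrance table P(affected | genotype at A, genotype at B).
# Rows index the genotype at locus A, columns the genotype at locus B.
PENETRANCE = [
    [Fraction(0), Fraction(1, 10), Fraction(0)],
    [Fraction(1, 10), Fraction(0), Fraction(1, 10)],
    [Fraction(0), Fraction(1, 10), Fraction(0)],
]


def marginal_penetrance(table, freq):
    """Average each row of `table` over the other locus's genotype frequencies."""
    return [sum(f * p for f, p in zip(freq, row)) for row in table]


def transpose(table):
    """Swap rows and columns so the same helper works for locus B."""
    return [list(col) for col in zip(*table)]


# Risk by genotype at a single locus, ignoring the other locus.
marginal_a = marginal_penetrance(PENETRANCE, GENOTYPE_FREQ)
marginal_b = marginal_penetrance(transpose(PENETRANCE), GENOTYPE_FREQ)

# A locus has a main effect if its marginal risk varies across genotypes.
has_main_effect_a = len(set(marginal_a)) > 1
has_main_effect_b = len(set(marginal_b)) > 1

# The joint effect is present if risk varies across genotype combinations.
has_joint_effect = len({p for row in PENETRANCE for p in row}) > 1

print("marginal risk by genotype at A:", [str(p) for p in marginal_a])
print("marginal risk by genotype at B:", [str(p) for p in marginal_b])
print("main effect at A:", has_main_effect_a)
print("main effect at B:", has_main_effect_b)
print("joint (interaction) effect:", has_joint_effect)
```

Under these assumed values every single-locus marginal risk is the same, so a test that screens loci one at a time would find nothing, whereas a method that evaluates genotype combinations jointly can still separate high-risk from low-risk cells.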