Identifying interacting SNPs using Monte Carlo logic regression.

Author information

Kooperberg Charles, Ruczinski Ingo

Affiliation

Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109-1024, USA.

Publication information

Genet Epidemiol. 2005 Feb;28(2):157-70. doi: 10.1002/gepi.20042.

DOI:10.1002/gepi.20042

PMID:15532037

Abstract

Interactions are frequently at the center of interest in single-nucleotide polymorphism (SNP) association studies. When interacting SNPs are in the same gene or in genes that are close in sequence, such interactions may suggest which haplotypes are associated with a disease. Interactions between unrelated SNPs may suggest genetic pathways. Unfortunately, data sets are often still too small to definitively determine whether interactions between SNPs occur. Also, competing sets of interactions could often be of equal interest. Here we propose Monte Carlo logic regression, an exploratory tool that combines Markov chain Monte Carlo and logic regression, an adaptive regression methodology that attempts to construct predictors as Boolean combinations of binary covariates such as SNPs. The goal of Monte Carlo logic regression is to generate a collection of (interactions of) SNPs that may be associated with a disease outcome, and that warrant further investigation. As such, the models that are fitted in the Markov chain are not combined into a single model, as is often done in Bayesian model averaging procedures. Instead, the most frequently occurring patterns in these models are tabulated. The method is applied to a study of heart disease with 779 participants and 89 SNPs. A simulation study is carried out to investigate the performance of the Monte Carlo logic regression approach.
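
The idea of sampling Boolean SNP models and tabulating recurring patterns, rather than averaging the models, can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a deliberately tiny model space (one AND or OR of two binary SNPs), a flat prior, a symmetric Metropolis proposal, and simulated data with a planted interaction. All function names and the data are hypothetical, and at least three SNP columns are required.

```python
import math
import random
from collections import Counter


def log_lik(pred, y):
    """Bernoulli log-likelihood of y, with the case probability estimated
    separately for pred == 0 and pred == 1 (lightly smoothed)."""
    total = 0.0
    for level in (0, 1):
        group = [yi for p, yi in zip(pred, y) if p == level]
        if not group:
            continue
        # Smoothing keeps the estimate strictly between 0 and 1.
        prob = (sum(group) + 0.5) / (len(group) + 1.0)
        total += sum(math.log(prob) if yi else math.log(1.0 - prob) for yi in group)
    return total


def predict(model, snps):
    """Evaluate the Boolean predictor (SNP_i op SNP_j) for every participant."""
    i, j, op = model
    if op == "and":
        return [row[i] & row[j] for row in snps]
    return [row[i] | row[j] for row in snps]


def propose(model, n_snps, rng):
    """Symmetric move: replace one of the two SNPs, or flip the operator."""
    i, j, op = model
    move = rng.choice(("swap_i", "swap_j", "flip_op"))
    if move == "flip_op":
        return (i, j, "or" if op == "and" else "and")
    new = rng.choice([k for k in range(n_snps) if k not in (i, j)])
    return (new, j, op) if move == "swap_i" else (i, new, op)


def sample_models(snps, y, n_iter=2000, seed=0):
    """Run a Metropolis chain over models and count how often each SNP pair
    appears in the visited models (tabulation, not model averaging)."""
    rng = random.Random(seed)
    n_snps = len(snps[0])
    model = (0, 1, "and")
    current = log_lik(predict(model, snps), y)
    pair_counts = Counter()
    for _ in range(n_iter):
        cand = propose(model, n_snps, rng)
        cand_ll = log_lik(predict(cand, snps), y)
        # Metropolis acceptance under a flat prior and symmetric proposal.
        if rng.random() < math.exp(min(0.0, cand_ll - current)):
            model, current = cand, cand_ll
        pair_counts[tuple(sorted(model[:2]))] += 1
    return pair_counts


# Simulated toy data (an assumption, not the heart disease study):
# 300 participants, 6 binary SNPs, outcome driven by SNP 2 AND SNP 4,
# with about 10% of outcomes flipped at random.
data_rng = random.Random(1)
snps = [[data_rng.randint(0, 1) for _ in range(6)] for _ in range(300)]
y = [1 if (row[2] & row[4]) ^ (data_rng.random() < 0.1) else 0 for row in snps]

counts = sample_models(snps, y)
print(counts.most_common(3))  # most frequently visited SNP pairs
```

The printed table plays the role of the tabulated patterns described in the abstract: pairs that the chain visits often are candidates for follow-up, not a final fitted model. A full logic regression search would operate on larger logic trees and a richer move set than this sketch does.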
