Department of Preventive Medicine, University of Southern California, 1540 Alcazar St., CHP-220, Los Angeles, CA 90089-9011, USA.
Hum Genomics. 2009 Oct;4(1):21-42. doi: 10.1186/1479-7364-4-1-21.
Candidate gene studies are generally motivated by some form of pathway reasoning in the selection of genes to be studied, but seldom has the logic of the approach been carried through to the analysis. Marginal effects of polymorphisms in the selected genes, and occasionally pairwise gene–gene or gene–environment interactions, are often presented, but a unified approach to modelling the entire pathway has been lacking. In this review, a variety of approaches to this problem is considered, focusing on hypothesis-driven rather than purely exploratory methods. Empirical modelling strategies are based on hierarchical models that allow prior knowledge about the structure of the pathway and the various reactions to be included as ‘prior covariates’. By contrast, mechanistic models aim to describe the reactions through a system of differential equations with rate parameters that can vary between individuals, based on their genotypes. Some ways of combining the two approaches are suggested, and Bayesian model averaging methods for dealing with uncertainty about the true model form in either framework are discussed. Biomarker measurements can be incorporated into such analyses, and two-phase sampling designs stratified on some combination of disease, genes and exposures can be an efficient way of obtaining data that would be too expensive or difficult to obtain on a full candidate gene sample. The review concludes with some thoughts about potential uses of pathways in genome-wide association studies.
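To make the mechanistic idea concrete, the following is a minimal sketch of a system of differential equations whose rate parameters depend on genotype. It is not taken from the review: the two-step pathway (exposure to intermediate X to metabolite M), the function names `rate` and `simulate`, the baseline rate constants and the per-allele multipliers are all hypothetical, chosen only for illustration.

```python
# Hypothetical two-step pathway: exposure -> intermediate X -> metabolite M -> cleared.
# All names, rate constants and genotype effects are illustrative assumptions,
# not values from the review.

def rate(base, per_allele_effect, genotype):
    """Rate constant for an individual carrying `genotype` (0, 1 or 2) variant alleles."""
    return base * (per_allele_effect ** genotype)


def simulate(exposure, g_activation, g_clearance, t_end=200.0, dt=0.01):
    """Forward-Euler integration of
        dX/dt = exposure - k1 * X
        dM/dt = k1 * X   - k2 * M
    starting from X = M = 0. Returns the final (X, M) concentrations."""
    k1 = rate(1.0, 1.5, g_activation)  # assumed: each variant allele speeds activation
    k2 = rate(0.5, 0.6, g_clearance)   # assumed: each variant allele slows clearance
    x = m = 0.0
    for _ in range(int(t_end / dt)):
        dx = exposure - k1 * x
        dm = k1 * x - k2 * m
        x += dt * dx
        m += dt * dm
    return x, m


if __name__ == "__main__":
    # Compare metabolite levels across clearance genotypes at a fixed exposure.
    for g in (0, 1, 2):
        _, m = simulate(exposure=2.0, g_activation=0, g_clearance=g)
        print(f"clearance genotype {g}: long-run M = {m:.3f}")
```

In this toy system the long-run metabolite level settles at exposure / k2, so it depends on the clearance genotype but not on the activation genotype. In a fuller analysis, a predicted quantity of this kind could serve as the covariate in a disease risk model or be compared with a measured biomarker.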