Department of Psychiatry, University of California, Los Angeles, California, USA.
Genet Epidemiol. 2009;33 Suppl 1(Suppl 1):S93-8. doi: 10.1002/gepi.20479.
Participants analyzed actual and simulated longitudinal data from the Framingham Heart Study for various metabolic and cardiovascular traits. The genetic information incorporated into these investigations ranged from selected single-nucleotide polymorphisms to genome-wide association arrays. Genotypes were incorporated using a broad range of methodological approaches, including conditional logistic regression, linear mixed models, generalized estimating equations, linear growth curve estimation, growth modeling, growth mixture modeling, population attributable risk fraction based on survival functions under the proportional hazards model, and multivariate adaptive splines for the analysis of longitudinal data. The specific scientific questions addressed by these approaches also varied, ranging from more precise phenotype definition, bias reduction in control selection, and estimation of effect sizes and genotype-associated risk to direct incorporation of genetic data into longitudinal models and exploration of population heterogeneity in longitudinal trajectories. The group reached several overall conclusions: (1) The additional information provided by longitudinal data may be useful in genetic analyses. (2) The precision of phenotype definition and of control selection in nested designs may be improved, especially if traits show a trend over time or have strong age-of-onset effects. (3) Analyzing genetic data stratified by high-risk subgroups, defined by a distinct course of development over time, could be useful for detecting rare mutations in common multifactorial diseases. (4) Estimation of the population impact of genomic risk variants could be more precise. The challenges and computational complexity posed by genome-wide single-nucleotide polymorphism data were also discussed.