Fan Ruzong, Zhang Yiwei, Albert Paul S, Liu Aiyi, Wang Yuanjia, Xiong Momiao
Biostatistics and Bioinformatics Branch, Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, Maryland.
Genet Epidemiol. 2012 Dec;36(8):856-69. doi: 10.1002/gepi.21673. Epub 2012 Sep 10.
Longitudinal genetic studies provide a valuable resource for exploring key genetic and environmental factors that affect complex traits over time. Genetic analysis of longitudinal data that incorporate temporal variation is important for understanding the genetic architecture and biological variation of common complex diseases. Despite this importance, few statistical methods are available for analyzing longitudinal human genetic data. In this article, we develop longitudinal methods for temporal association mapping of population longitudinal data. Both parametric and nonparametric models are proposed. The models can be applied to multiple diallelic genetic markers, such as single-nucleotide polymorphisms, and to multiallelic markers, such as microsatellites. Using analytical formulae, we show that the models take both linkage disequilibrium and temporal trends into account simultaneously. A variance-covariance structure, based on the theory of stochastic processes, is constructed to model both the variation of a single measurement and the correlations among multiple measurements of an individual. Novel penalized spline models are used to estimate the time-dependent mean functions and regression coefficients. The methods were applied to Framingham Heart Study data from Genetic Analysis Workshop (GAW) 13 and GAW 16. The proposed approaches successfully detected the temporal trends and genetic effects of systolic blood pressure. Simulation studies indicated that the nonparametric penalized linear model is the best choice for fitting real data. This research sheds light on the important area of longitudinal genetic analysis and provides a basis for future methodological investigations and practical applications.
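As a rough illustration of what estimating a time-dependent mean function with a penalized spline can look like, the sketch below fits a truncated-linear spline basis with a ridge penalty on the knot coefficients. It is not the authors' model: the abstract does not give the basis, the penalty, or the fitting procedure, so all of those are assumptions here, and the sketch ignores the genetic covariates and the within-individual covariance structure that the article describes. The function names (`truncated_linear_basis`, `fit_penalized_spline`, `predict_mean`), the knot placement, the smoothing parameter `lam`, and the synthetic data are all hypothetical; no Framingham data are used.

```python
import numpy as np


def truncated_linear_basis(t, knots):
    """Design matrix [1, t, (t - k_1)_+, ..., (t - k_K)_+]."""
    t = np.asarray(t, dtype=float)
    knots = np.asarray(knots, dtype=float)
    hinge = np.maximum(t[:, None] - knots[None, :], 0.0)
    return np.column_stack([np.ones_like(t), t, hinge])


def fit_penalized_spline(t, y, knots, lam):
    """Minimize ||y - X b||^2 + lam * sum of squared knot coefficients.

    The intercept and slope are left unpenalized (an assumption of this
    sketch); only the hinge-term coefficients are shrunk toward zero.
    """
    X = truncated_linear_basis(t, knots)
    penalty = np.zeros(X.shape[1])
    penalty[2:] = 1.0
    lhs = X.T @ X + lam * np.diag(penalty)
    rhs = X.T @ np.asarray(y, dtype=float)
    return np.linalg.solve(lhs, rhs)


def predict_mean(t, knots, coef):
    """Evaluate the fitted mean function at the time points t."""
    return truncated_linear_basis(t, knots) @ coef


# Synthetic illustration only: a made-up piecewise-linear trend plus noise.
rng = np.random.default_rng(0)
ages = np.linspace(20.0, 80.0, 200)
true_mean = 110.0 + 0.5 * (ages - 20.0) + 0.3 * np.maximum(ages - 50.0, 0.0)
observed = true_mean + rng.normal(0.0, 2.0, size=ages.size)

knots = np.linspace(30.0, 70.0, 9)  # hypothetical, evenly spaced knots
coef = fit_penalized_spline(ages, observed, knots, lam=1.0)
fitted = predict_mean(ages, knots, coef)
```

Larger values of `lam` pull the fit toward a straight line, while `lam = 0` reduces to ordinary least squares on the spline basis. In practice the smoothing parameter would be chosen by a data-driven criterion, and repeated measurements on the same individual would call for a covariance model rather than the independence assumed in this sketch.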