Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama, United States of America.
PLoS One. 2012;7(7):e40964. doi: 10.1371/journal.pone.0040964. Epub 2012 Jul 25.
Genetic factors are believed to account for 25% of the interindividual differences in Years of Life (YL) among humans. However, the genetic loci found so far to be associated with YL explain only a very small proportion of the expected genetic variation in this trait, perhaps reflecting the complexity of the trait and the limitations of traditional association studies when applied to traits affected by a large number of small-effect genes. Using data from the Framingham Heart Study and statistical methods borrowed largely from the field of animal genetics (whole-genome prediction, WGP), we developed a WGP model for the study of YL and evaluated the extent to which thousands of genetic variants across the genome, examined simultaneously, can be used to predict interindividual differences in YL. We find that a sizable proportion of the differences in YL that were unexplained by age at entry, sex, smoking and body mass index (BMI) can be accounted for and predicted using WGP methods. The contribution of genomic information to prediction accuracy was even higher than that of smoking and BMI combined, two predictors that are considered among the most important life-shortening factors. We evaluated the impacts of familial relationships and population structure (as described by the first two marker-derived principal components) and concluded that, in our dataset, population structure explained part, but not all, of the gains in prediction accuracy obtained with WGP. Further inspection of prediction accuracies by age at death indicated that most of the gains in predictive ability achieved with WGP were due to more accurate prediction of early mortality, perhaps reflecting the ability of WGP to capture differences in genetic risk of deadly diseases such as cancer, which are most often responsible for early mortality in our sample.
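The comparison described above (prediction from covariates alone versus covariates plus genome-wide markers) can be illustrated with a minimal sketch. The abstract does not specify the model, so this sketch assumes ridge regression on marker genotypes, one common choice for whole-genome prediction, and it is not presented as the authors' method. All data are simulated, and every name (`ridge_fit`, `cv_accuracy`, the sample sizes, the penalty value) is hypothetical.

```python
import numpy as np

def ridge_fit(X_cov, X_mrk, y, lam):
    """Least squares with an L2 penalty on marker effects only.

    Covariate effects (intercept, age at entry, sex, ...) are left unpenalized.
    """
    X = np.hstack([X_cov, X_mrk])
    penalty = np.concatenate([np.zeros(X_cov.shape[1]),
                              np.full(X_mrk.shape[1], lam)])
    # Solve (X'X + diag(penalty)) beta = X'y
    return np.linalg.solve(X.T @ X + np.diag(penalty), X.T @ y)

def cv_accuracy(X_cov, X_mrk, y, lam, n_folds=5, seed=0):
    """Correlation between held-out predictions and observed values."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % n_folds  # balanced random fold labels
    pred = np.empty(len(y))
    for k in range(n_folds):
        test = folds == k
        train = ~test
        beta = ridge_fit(X_cov[train], X_mrk[train], y[train], lam)
        pred[test] = np.hstack([X_cov[test], X_mrk[test]]) @ beta
    return float(np.corrcoef(pred, y)[0, 1])

# Simulated data (assumption: nothing here comes from the Framingham Heart Study).
rng = np.random.default_rng(1)
n, p = 300, 1000
freq = rng.uniform(0.1, 0.5, size=p)                  # allele frequencies
M = rng.binomial(2, freq, size=(n, p)).astype(float)  # genotypes coded 0/1/2
M -= M.mean(axis=0)                                   # center each marker

age_at_entry = rng.normal(50.0, 10.0, size=n)
sex = rng.integers(0, 2, size=n).astype(float)
X_cov = np.column_stack([np.ones(n), age_at_entry, sex])

marker_effects = rng.normal(0.0, 0.15, size=p)        # many small effects
y = 60.0 + 0.3 * age_at_entry + 2.0 * sex + M @ marker_effects \
    + rng.normal(0.0, 5.0, size=n)

no_markers = np.empty((n, 0))                         # covariates-only baseline
acc_cov = cv_accuracy(X_cov, no_markers, y, lam=0.0)
acc_wgp = cv_accuracy(X_cov, M, y, lam=500.0)         # penalty chosen arbitrarily
print(f"covariates only:      r = {acc_cov:.3f}")
print(f"covariates + markers: r = {acc_wgp:.3f}")
```

The difference between the two cross-validated correlations is the kind of quantity the abstract refers to as the contribution of genomic information to prediction accuracy. In practice the penalty would be tuned or estimated from the data rather than fixed, and relatedness among individuals would need to be handled when forming the folds.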