Hawkes Gareth, Yengo Loic, Vedantam Sailaja, Marouli Eirini, Beaumont Robin N, Tyrrell Jessica, Weedon Michael N, Hirschhorn Joel, Frayling Timothy M, Wood Andrew R
Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, UK.
Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia.
bioRxiv. 2023 Feb 10:2023.02.10.528019. doi: 10.1101/2023.02.10.528019.
Findings from genome-wide association studies have facilitated the generation of genetic predictors for many common human phenotypes. Identifying individuals whose phenotype is misaligned with a genetic predictor based on common variants may be important for follow-up studies that aim to identify alternative causal factors. Using genome-wide imputed genetic data, we aimed to classify 158,951 unrelated individuals from the UK Biobank as either concordant with or deviating from their genetic prediction for two well-measured phenotypes. We first applied our methods to standing height: our primary analysis classified 244 individuals (0.15%) as misaligned with their genetically predicted height. We show that these individuals were enriched for self-reporting being shorter or taller than average at age 10, for diagnosed congenital malformations, and for rare loss-of-function variants in genes previously catalogued as causal for growth disorders. Second, we applied our methods to LDL cholesterol (LDL-C). We classified 156 individuals (0.12%) as misaligned with their genetically predicted LDL-C and show that these individuals were enriched for both clinically actionable cardiovascular risk factors and rare genetic variants in genes previously shown to be involved in metabolic processes. Individuals whose LDL-C was higher than expected based on the genetic predictor were also at higher risk of developing coronary artery disease and type 2 diabetes, even after adjustment for measured LDL-C, BMI and age, suggesting that upward deviation from genetically predicted LDL-C is indicative of generally poor health. Our results remained broadly consistent in sensitivity analyses that used a variety of parametric and non-parametric methods to define individuals deviating from polygenic expectation. Our analyses demonstrate the potential importance of quantitatively identifying individuals for further follow-up based on deviation from genetic predictions.
Human genetics is becoming increasingly useful for predicting human traits across a population, owing to findings from large-scale genetic association studies and advances in the power of genetic predictors. This provides an opportunity to identify individuals who deviate from genetic predictions for a common phenotype under investigation. For example, an individual may be genetically predicted to be tall but be shorter than expected. Identifying such individuals is potentially important because it can facilitate further follow-up to assess likely causes. Using 158,951 unrelated individuals from the UK Biobank, with height and LDL cholesterol as exemplar traits, we demonstrate that approximately 0.15% and 0.12% of individuals, respectively, deviate from their genetically predicted phenotypes. We observed these individuals to be enriched for a range of rare clinical diagnoses, as well as rare genetic factors that may be causal. Our analyses also demonstrate several methods for detecting individuals who deviate from genetic predictions, which can be applied to a range of continuous human phenotypes.
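To make the idea of "deviating from a genetic prediction" concrete, the following is a minimal illustrative sketch, not the method used in the study. It assumes a single polygenic score per individual, fits an ordinary least-squares line of phenotype on score, and flags individuals whose standardised residual exceeds a cut-off. The function name `flag_deviators`, the threshold of 4 standard deviations, and the simulated data are all assumptions introduced for illustration only.

```python
import numpy as np


def flag_deviators(pgs, phenotype, z_threshold=4.0):
    """Return residual z-scores and a boolean mask of flagged individuals.

    Hypothetical helper: the threshold and the linear model are
    illustrative assumptions, not values taken from the study.
    """
    pgs = np.asarray(pgs, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    # Ordinary least-squares fit of phenotype on the polygenic score.
    slope, intercept = np.polyfit(pgs, phenotype, deg=1)
    residuals = phenotype - (slope * pgs + intercept)
    # Standardise residuals; ddof=2 accounts for the two fitted parameters.
    z = residuals / residuals.std(ddof=2)
    return z, np.abs(z) > z_threshold


# Simulated example data (not real measurements).
rng = np.random.default_rng(0)
n = 5000
pgs = rng.normal(size=n)
height = 170 + 4 * pgs + rng.normal(scale=5, size=n)
height[0] -= 40  # one simulated individual far shorter than predicted

z, flagged = flag_deviators(pgs, height)
print("number flagged:", int(flagged.sum()))
print("first individual flagged:", bool(flagged[0]))
```

A parametric z-score cut-off like this is only one option; the abstract notes that the results were checked against both parametric and non-parametric definitions of deviation, which a rank-based or quantile-based variant of the same residual could approximate.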