Department of Twin Research and Genetic Epidemiology, King's College London, London, SE1 7EH, UK.
School of Mathematics and Statistics, The Open University, Milton Keynes, MK7 6AA, UK.
Hum Mol Genet. 2023 Aug 7;32(16):2638-2645. doi: 10.1093/hmg/ddad093.
Type 2 diabetes (T2D) is a heterogeneous illness caused by genetic and environmental factors. Previous genome-wide association studies (GWAS) have identified many genetic variants associated with T2D and found evidence that genetic profiles differ by age-at-onset. This study further explores the genetic and environmental drivers of T2D by analyzing subgroups defined by age-at-onset of diabetes and body mass index (BMI). In the UK Biobank, 36 494 T2D cases were stratified into three subgroups, and GWAS was performed for all T2D cases and for each subgroup relative to 421 021 controls. Altogether, 18 single nucleotide polymorphisms (SNPs) reached genome-wide significance for T2D in one or more subgroups and also showed evidence of heterogeneity between the subgroups (Cochran's Q P < 0.01), with two SNPs (in CDKN2B and CYTIP) remaining significant after correction for multiple testing. Combined risk scores based on genetic profile, BMI and age gave excellent diabetes prediction [area under the ROC curve (AUC) = 0.92]. A modest improvement in prediction (AUC = 0.93) was seen when the contribution of genetic and environmental factors was evaluated separately for each subgroup. Increasing sample sizes in genetic studies make it possible to stratify disease cases into subgroups that have sufficient power to highlight areas of genetic heterogeneity. Despite some evidence that optimizing combined risk scores by subgroup improves prediction, larger sample sizes are likely needed for prediction when using a stratification approach.
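The between-subgroup heterogeneity test named in the abstract can be illustrated with a minimal sketch of Cochran's Q. This is not the authors' pipeline: the function names `cochran_q` and `chi2_sf_df2` and the effect sizes and standard errors below are hypothetical, chosen only to show the arithmetic. The sketch assumes each subgroup GWAS yields a log odds ratio and a standard error for a SNP, and that with three subgroups Q is referred to a chi-square distribution with 2 degrees of freedom, for which the upper-tail probability has the closed form exp(-Q/2).

```python
import math


def cochran_q(betas, ses):
    """Return (Q, df) for per-subgroup effect estimates and standard errors.

    Q is the inverse-variance-weighted sum of squared deviations of each
    subgroup estimate from the pooled (fixed-effect) estimate.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return q, len(betas) - 1


def chi2_sf_df2(q):
    """Upper-tail chi-square probability, valid only for 2 degrees of freedom."""
    return math.exp(-q / 2.0)


# Hypothetical log odds ratios and standard errors for one SNP
# in three subgroups (illustrative values, not taken from the study).
betas = [0.10, 0.20, 0.30]
ses = [0.05, 0.05, 0.05]

q, df = cochran_q(betas, ses)
p = chi2_sf_df2(q)  # applicable here because df == 2
print(f"Q = {q:.3f}, df = {df}, P = {p:.4f}")
```

A SNP would be flagged as heterogeneous under the abstract's threshold when this P value falls below 0.01. For a number of subgroups other than three, a general chi-square survival function would be needed in place of `chi2_sf_df2`.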