Gupte Trisha P, Azizi Zahra, Kho Pik Fang, Zhou Jiayan, Nzenkue Kevin, Chen Ming-Li, Panyard Daniel J, Guarischi-Sousa Rodrigo, Hilliard Austin T, Sharma Disha, Watson Kathleen, Abbasi Fahim, Tsao Philip S, Clarke Shoa L, Assimes Themistocles L
Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA.
Meharry Medical College, Nashville, TN, USA.
medRxiv. 2024 Sep 15:2024.09.13.24313501. doi: 10.1101/2024.09.13.24313501.
AIMS/HYPOTHESIS: The plasma proteome holds promise as a diagnostic and prognostic tool that can accurately reflect complex human traits and disease processes. We assessed the ability of plasma proteins to predict type 2 diabetes mellitus (T2DM) and related traits.
METHODS: Clinical, genetic, and high-throughput proteomic data from three subcohorts of UK Biobank participants were analyzed for association with dual-energy x-ray absorptiometry (DXA) derived truncal fat (in the adiposity subcohort), estimated maximum oxygen consumption (VO₂max) (in the fitness subcohort), and incident T2DM (in the T2DM subcohort). We used least absolute shrinkage and selection operator (LASSO) regression to assess the relative ability of non-proteomic and proteomic variables to associate with each trait by comparing variance explained (R²) and area under the curve (AUC) statistics between data types. Stability selection with randomized LASSO regression identified the most robustly associated proteins for each trait. The benefit of proteomic signatures (PSs) over QDiabetes, a T2DM clinical risk score, was evaluated through the derivation of delta (Δ) AUC values. We also assessed the incremental gain in model performance metrics using proteomic datasets with varying numbers of proteins. A series of two-sample Mendelian randomization (MR) analyses were conducted to identify potentially causal proteins for adiposity, fitness, and T2DM.
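The Δ AUC comparison described above can be illustrated with a minimal sketch. The abstract does not state how the confidence interval was obtained, so the percentile bootstrap used here is an assumption, and the function names `auc` and `delta_auc_ci` are hypothetical. The sketch takes binary outcome labels (1 = incident T2DM) and two sets of risk scores, one from a baseline model such as a clinical risk score and one from a model that adds a proteomic signature.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random
    non-case, with ties counted as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def delta_auc_ci(labels, base_scores, new_scores, n_boot=500, seed=0):
    """Point estimate of AUC(new) - AUC(base) with an approximate 95%
    percentile-bootstrap interval (an assumed method, not the authors')."""
    rng = random.Random(seed)
    n = len(labels)
    point = auc(labels, new_scores) - auc(labels, base_scores)
    deltas = []
    while len(deltas) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # resample contained only one class; draw again
        new = [new_scores[i] for i in idx]
        base = [base_scores[i] for i in idx]
        deltas.append(auc(y, new) - auc(y, base))
    deltas.sort()
    lower = deltas[int(0.025 * (n_boot - 1))]
    upper = deltas[int(0.975 * (n_boot - 1))]
    return point, lower, upper
```

Both models are scored on the same resampled participants in each bootstrap draw, so the interval reflects the paired difference rather than two independent AUC estimates. The pairwise AUC computation is quadratic in sample size and is intended only for illustration on small inputs.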
RESULTS: Across all three subcohorts, the mean age was 56.7 years and 54.9% were female. In the T2DM subcohort, 5.8% developed incident T2DM over a median follow-up of 7.6 years. LASSO-derived PSs increased the R² of truncal fat and VO₂max over clinical and genetic factors by 0.074 and 0.057, respectively. The improvement in T2DM prediction over the QDiabetes score [Δ AUC: 0.016 (95% CI 0.008, 0.024)] was similar whether we used a robust PS derived strictly from the T2DM outcome or a model further augmented with non-overlapping proteins associated with adiposity and fitness. A small number of proteins (29 for truncal adiposity, 18 for VO₂max, and 26 for T2DM) identified by stability selection algorithms offered most of the improvement in prediction of each outcome. Filtered and clustered versions of the full proteomic dataset supplied by the UK Biobank (ranging from 600 to 1,500 proteins) performed comparably to the full dataset for T2DM prediction. Using MR, we identified 4 proteins as potentially causal for adiposity, 1 as potentially causal for fitness, and 4 as potentially causal for T2DM.
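For readers unfamiliar with two-sample MR, the following minimal sketch shows one common estimator, the fixed-effect inverse-variance weighted (IVW) combination of per-variant Wald ratios. The abstract does not say which MR estimators were used, so the choice of IVW is an assumption, and the function name `ivw_estimate` and its inputs are hypothetical. Each list holds one entry per genetic instrument: the variant's effect on the protein (exposure), its effect on the outcome, and the standard error of the outcome effect.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the exposure-to-outcome effect.

    Each variant contributes a Wald ratio (outcome effect divided by
    exposure effect), weighted by beta_exposure**2 / se_outcome**2
    (first-order weights that ignore uncertainty in the exposure effect).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must have one entry per variant")
    if any(bx == 0 for bx in beta_exposure):
        raise ValueError("exposure effects must be non-zero")
    if any(se <= 0 for se in se_outcome):
        raise ValueError("standard errors must be positive")
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    std_error = math.sqrt(1.0 / total)
    return estimate, std_error
```

A result from this estimator is only suggestive of causality: it relies on the instruments affecting the outcome solely through the protein, which is why the abstract describes its findings as potentially causal.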
CONCLUSIONS/INTERPRETATION: Plasma PSs modestly improve the prediction of incident T2DM over that possible with clinical and genetic factors. Further studies are warranted to better elucidate the clinical utility of these signatures in predicting the risk of T2DM over the standard practice of using the QDiabetes score. Candidate causally associated proteins identified through MR deserve further study as potential novel therapeutic targets for T2DM.