Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
Novo Nordisk Foundation Center for Protein Research, Translational Disease Systems Biology Group, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.
BMJ Open Diabetes Res Care. 2021 Mar;9(1). doi: 10.1136/bmjdrc-2020-001953.
Although various lipid and non-lipid analytes measured by nuclear magnetic resonance (NMR) spectroscopy have been associated with type 2 diabetes, no structured comparison of how well NMR-derived biomarkers and standard lipids predict individual diabetes risk has been undertaken, either in larger studies or among individuals at high risk of diabetes.
The cumulative discriminative utility of several groups of biomarkers, including NMR lipoproteins, related non-lipid biomarkers, standard lipids, and demographic and glycemic traits, was compared for short-term (3.2 years) and long-term (15 years) diabetes development in the Diabetes Prevention Program, a multiethnic, placebo-controlled, randomized controlled trial of individuals with pre-diabetes in the USA (N=2590). Logistic regression, Cox proportional hazards models and six different hyperparameter-tuned machine learning algorithms were compared. The Matthews Correlation Coefficient (MCC) was used as the primary measure of discriminative utility.
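For readers unfamiliar with the MCC, the following is a minimal sketch of how it is computed from a binary confusion matrix. It illustrates the standard definition only and is not the study's analysis code. The function names and the example labels are hypothetical, and returning 0.0 when the denominator is zero is a common convention assumed here.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; ranges from -1 to +1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined cases (an empty row or column
        # in the confusion matrix) are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom


def mcc(y_true, y_pred):
    """MCC computed directly from two equal-length 0/1 label sequences."""
    return mcc_from_counts(*confusion_counts(y_true, y_pred))


if __name__ == "__main__":
    # Hypothetical labels: 1 = developed diabetes, 0 = did not.
    observed = [1, 1, 0, 0]
    predicted = [1, 0, 0, 0]
    print(round(mcc(observed, predicted), 3))
```

An MCC of 0 corresponds to chance-level prediction and 1 to perfect prediction. Because it uses all four cells of the confusion matrix, it is less easily inflated by class imbalance than accuracy.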
Models with baseline NMR analytes and their changes did not improve on the discriminative utility of simpler models that included standard lipids or demographic and glycemic traits. Across all algorithms, models with baseline 2-hour glucose performed best (max MCC=0.36). Sophisticated machine learning algorithms performed similarly to logistic regression in this study.
NMR lipoproteins and related non-lipid biomarkers were associated with diabetes risk but did not augment its discrimination beyond traditional diabetes risk factors, except for 2-hour glucose. Machine learning algorithms provided no meaningful improvement in discrimination compared with logistic regression, which suggests a lack of influential latent interactions among the analytes assessed in this study.
Diabetes Prevention Program: NCT00004992; Diabetes Prevention Program Outcomes Study: NCT00038727.