Wang Wenqiang, Mo Ruiqing, Chen Xingyu, Yang Sijie
Department of Plastic and Reconstructive Surgery, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Medical Sciences, Guangxi Academy of Medical Sciences, Nanning, Guangxi, China.
Department of Bone and Joint Surgery, Guangxi Diabetic Foot Salvage Engineering Research Center, The First Affiliated Hospital of Guangxi Medical University, Nanning, China.
Front Public Health. 2025 Aug 22;13:1606751. doi: 10.3389/fpubh.2025.1606751. eCollection 2025.
Obesity is a prevalent and clinically significant complication among individuals with diabetes mellitus (DM), contributing to increased cardiovascular risk, metabolic burden, and reduced quality of life. Despite its high prevalence, the risk factors for obesity within this population remain incompletely understood. The growing availability of large-scale health datasets and advances in machine learning offer an opportunity to improve risk stratification. This study aimed to identify key predictors of obesity and to develop a machine learning-based predictive model for patients with type 2 diabetes mellitus (T2DM) using data from the National Health and Nutrition Examination Survey (NHANES).
Data from adults with diabetes were extracted from the NHANES 2007-2018 cycles. Participants were categorized into obese and non-obese groups based on BMI. Least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation was used to select relevant features. Subsequently, nine machine learning algorithms were used to construct predictive models: logistic regression, random forest (RF), radial support vector machine (RSVM), k-nearest neighbors (KNN), XGBoost, LightGBM, decision tree (DT), elastic net regression (ENet), and multilayer perceptron (MLP). Model performance was evaluated using the area under the ROC curve (AUC), calibration curves, the Brier score, and decision curve analysis (DCA). The best-performing model was visualized as a nomogram to enhance clinical applicability.
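The evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code: the function names and the toy labels and probabilities are assumptions made for illustration, and the metrics are written in plain Python rather than with whatever software the study used.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auc(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], p_pred: Sequence[float], threshold: float) -> float:
    """Net benefit at one threshold probability, the quantity plotted in DCA."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy data (hypothetical, not from NHANES): 1 = obese, 0 = non-obese.
y_true = [1, 0, 1, 1, 0, 0]
p_pred = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

print("Brier score:", round(brier_score(y_true, p_pred), 3))
print("AUC:", round(auc(y_true, p_pred), 3))
print("Net benefit at 0.5:", round(net_benefit(y_true, p_pred, 0.5), 3))
```

A decision curve repeats the net-benefit calculation over a range of thresholds and compares the model with the "treat all" and "treat none" strategies.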
A total of 3,794 participants with type 2 diabetes were included in the analysis, of whom 57.0% were classified as obese. LASSO regression identified 19 key variables associated with obesity. Among the nine machine learning models evaluated, the logistic regression model demonstrated the best overall performance, with the lowest Brier score. It also showed good discrimination (AUC = 0.751 in the training set and 0.781 in the test set), favorable calibration, and consistent clinical utility on DCA. A nomogram was constructed from the logistic regression model to facilitate individualized risk prediction, with total points corresponding to predicted probabilities of obesity.
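How a nomogram turns a logistic regression model into points can be sketched as follows. This is one common construction, not necessarily the one the authors used. The intercept, the coefficients, the predictor names, and their ranges are all hypothetical placeholders. None of them come from the study.

```python
import math

# Hypothetical model, NOT the study's fitted coefficients.
# Each entry: name -> (beta, lowest plausible value, highest plausible value)
INTERCEPT = -1.0
PREDICTORS = {
    "predictor_a": (0.04, 0.0, 50.0),
    "predictor_b": (-0.5, 0.0, 2.0),
}


def reference_value(beta: float, low: float, high: float) -> float:
    """End of the range that contributes least to the risk (scores 0 points)."""
    return low if beta >= 0 else high


def max_span() -> float:
    """Largest change in log-odds any single predictor can cause (scores 100 points)."""
    return max(abs(beta) * (high - low) for beta, low, high in PREDICTORS.values())


def nomogram_points(values: dict) -> dict:
    """Points per predictor, scaled so the most influential one spans 0 to 100."""
    span = max_span()
    points = {}
    for name, (beta, low, high) in PREDICTORS.items():
        ref = reference_value(beta, low, high)
        points[name] = 100.0 * beta * (values[name] - ref) / span
    return points


def probability_from_points(total_points: float) -> float:
    """Map total points back to the linear predictor, then to a probability."""
    baseline = INTERCEPT + sum(
        beta * reference_value(beta, low, high)
        for beta, low, high in PREDICTORS.values()
    )
    linear_predictor = baseline + total_points * max_span() / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


patient = {"predictor_a": 25.0, "predictor_b": 1.0}  # hypothetical patient
points = nomogram_points(patient)
total = sum(points.values())
print(points, total, round(probability_from_points(total), 3))
```

The total points carry the same information as the model's linear predictor, so reading the probability off the nomogram scale is equivalent to evaluating the logistic function directly.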
Obesity remains highly prevalent among individuals with type 2 diabetes. Our findings highlight key clinical features associated with obesity risk and provide a practical tool to aid in early identification and individualized management of high-risk patients.