Wu Xuelun, Zhai Furui, Chang Ailing, Wei Jing, Guo Yanan, Zhang Jincheng
Department of Endocrinology, Cangzhou Central Hospital, Cangzhou City, Hebei Province, People's Republic of China.
Gynecological Clinic, Cangzhou Central Hospital, Cangzhou City, Hebei Province, People's Republic of China.
Diabetes Metab Syndr Obes. 2023 Jun 30;16:1987-2003. doi: 10.2147/DMSO.S406695. eCollection 2023.
Diagnosing osteoporosis in patients with type 2 diabetes mellitus (T2DM) based on bone mineral density (BMD) remains challenging. We sought to develop machine learning prediction models for use as screening instruments for osteoporosis in T2DM patients.
Data were collected from 433 participants and analyzed using nine machine learning classification algorithms, with features selected from demographic and clinical variables. The classification models were compared using the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, average precision (AP), precision, F1 score, precision-recall curves, calibration plots, and decision curve analysis (DCA) to determine the best model. In addition, 5-fold cross-validation was used to optimize the model, and feature importance was then evaluated using SHapley Additive exPlanations (SHAP). Latent class analysis (LCA) was used to identify distinct subpopulations as discrete classes.
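As a rough illustration of the comparison metrics named above, the following pure-Python sketch computes AUROC, AP, the threshold-based metrics, and the net benefit used in decision curve analysis. It is not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AP as the mean precision at each rank where a positive case appears."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


def _counts(y_true, scores, threshold):
    """Confusion-matrix counts, predicting positive when score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    tn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 0)
    return tp, fp, fn, tn


def threshold_metrics(y_true, scores, threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, precision, and F1 at one threshold."""
    tp, fp, fn, tn = _counts(y_true, scores, threshold)
    ratio = lambda num, den: num / den if den else 0.0
    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def net_benefit(y_true, scores, threshold: float) -> float:
    """DCA net benefit: TP/n - FP/n * pt / (1 - pt), for 0 < pt < 1."""
    tp, fp, _, _ = _counts(y_true, scores, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Toy data (hypothetical): 1 = osteoporosis, 0 = no osteoporosis.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
print(roc_auc(y_true, scores), average_precision(y_true, scores))
print(threshold_metrics(y_true, scores), net_benefit(y_true, scores, 0.5))
```

In a 5-fold cross-validation setting, the same functions would be applied to the held-out fold of each split and the results averaged. A decision curve is obtained by evaluating `net_benefit` over a range of threshold probabilities.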
Nine feature variables were selected to construct predictive models for osteoporosis in individuals with T2DM. Across the machine learning algorithms, AP ranged from 0.444 to 1.000. The XGBoost model was selected as the final prediction model, with an AUROC of 0.940 in the training set, 0.772 in the validation set under 5-fold cross-validation, and 0.872 in the test set. SHAP analysis identified 25-hydroxyvitamin D (25(OH)D) as the most important risk factor. Additionally, a 3-class model constructed using LCA categorized individuals into high-, medium-, and low-risk groups.
We developed a model that predicts osteoporosis in patients with T2DM with high accuracy and clinical validity. We also identified three subpopulations with differing osteoporosis risk using LCA. However, the limited sample size warrants cautious interpretation of the results, and validation in larger cohorts is needed.