Department of Orthopaedics, Xinfeng County People's Hospital, Xinfeng, 341600, Jiangxi, China.
Department of ICU, GanZhou People's Hospital, GanZhou, 341000, Jiangxi, China.
Sci Rep. 2024 Mar 4;14(1):5245. doi: 10.1038/s41598-024-56114-1.
Osteoporosis is a major public health concern that significantly increases the risk of fractures. This study aimed to develop a machine learning-based predictive model that screens for individuals at high risk of osteoporosis using chronic disease data, thereby facilitating early detection and personalized management. A total of 10,000 complete patient records from primary healthcare data in the German Disease Analyzer database (IMS HEALTH) were included, of which 1293 were from patients diagnosed with osteoporosis and 8707 from patients without the condition. Demographic characteristics and chronic disease data, including age, gender, lipid disorder, cancer, chronic obstructive pulmonary disease (COPD), hypertension, heart failure, coronary heart disease (CHD), diabetes, chronic kidney disease, and stroke, were collected from electronic health records. Ten different machine learning algorithms were employed to construct the predictive model. The model's performance was then validated, and the relative importance of its features was analyzed. Of the ten algorithms, the Stacker model, built on Logistic Regression, AdaBoost Classifier, and Gradient Boosting Classifier, demonstrated superior performance. It showed excellent performance in ten-fold cross-validation on the training set and in ROC curve analysis on the test set. The confusion matrix, lift curve, and calibration curves indicated that the Stacker model had optimal clinical utility. Feature importance analysis highlighted age, gender, lipid metabolism disorders, cancer, and COPD as the five most influential variables. In this study, a predictive model for osteoporosis based on chronic disease data was developed using machine learning. The model shows great potential for early detection and risk stratification of osteoporosis, ultimately facilitating personalized prevention and management strategies.
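The abstract names ROC curve analysis and a lift curve as evaluation tools without showing how either is computed. The sketch below is a minimal, standard-library illustration of both quantities, not the authors' code. The function names `roc_auc` and `lift_at` and the ten-record toy data are hypothetical and chosen only for illustration; the only figure taken from the study is the osteoporosis prevalence of 1293 in 10,000 records, which is the baseline against which any lift would be read.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def lift_at(labels, scores, fraction):
    """Positive rate in the top-scored `fraction` of records over the overall rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Baseline prevalence reported in the abstract: 1293 of 10,000 records.
study_prevalence = 1293 / 10000

# Hypothetical toy data (NOT from the study): 1 = osteoporosis, 0 = no osteoporosis,
# with made-up predicted risks standing in for a model's output.
labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.40, 0.35, 0.30, 0.25, 0.20, 0.10, 0.05]

auc = roc_auc(labels, scores)
lift_top30 = lift_at(labels, scores, 0.3)

print(f"study prevalence: {study_prevalence:.4f}")
print(f"toy AUC: {auc:.3f}")
print(f"toy lift in top 30%: {lift_top30:.3f}")
```

A lift above 1 in the top-scored group means that screening the highest-risk records first finds cases at a higher rate than the baseline prevalence would. The pairwise AUC loop is quadratic in the number of records, which is fine for a sketch but would normally be replaced by a rank-based computation on a dataset of this size.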