Wang Peng, Yin Qiang, Ding Kangzhi, Zhong Huaichang, Jia Qundi, Xiao Zhasang, Xiong Hai
School of Medicine, Tibet University, Lhasa, 850000, China.
School of Ecology and Environment, Tibet University, Lhasa, 850000, China.
Sci Rep. 2025 Mar 31;15(1):10960. doi: 10.1038/s41598-025-95707-2.
The aim of this study was to identify the optimal model for predicting osteoporosis risk in middle-aged and elderly women in Tibet by comparing the performance of six prediction models that incorporate biochemical indexes. We used a cross-sectional survey with multi-stage cluster random sampling. From January 2022 to January 2024, we collected biochemical and bone mineral density (BMD) data from high-altitude areas of Tibet. We built the prediction models in three steps. First, we performed feature selection to identify factors associated with osteoporosis. Next, eligible participants were randomly divided into a training set and a test set in a ratio of 8:2. Then, osteoporosis prediction models were established using Random Forest, artificial neural network (ANN), extreme gradient boosting (XGB), and support vector machine (SVM). Model performance was then compared using sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC) to select the best model. Correlation analysis was used to screen for indicators significantly correlated with T-score. Age (P < 0.01), LDL-C (P < 0.05), UA (P < 0.01), AST (P < 0.05), CREA (P < 0.01), BMI (P < 0.01), and ALT (P < 0.01) were associated with osteoporosis. In the training set, the AUCs from highest to lowest were Random Forest (1.000), XGB (0.887), SVM (0.868), regression (0.801), ANN (0.793), and the Osteoporosis Self-Assessment Tool for Asians (OSTA) index (0.739). In the test set, the AUCs from highest to lowest were XGB (0.848), regression (0.801), Random Forest (0.772), SVM (0.755), OSTA (0.739), and ANN (0.732). In middle-aged and elderly Tibetan residents, the SVM and XGB models screened for osteoporosis better than OSTA. Compared with Random Forest, ANN, and SVM, the XGB model had the best predictive ability and can be used to predict osteoporosis risk from biochemical indexes. The model needs to be further refined in studies with larger samples.
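The evaluation procedure described above (an 8:2 random split, then comparison by sensitivity, specificity, and AUC) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the fixed random seed, and the toy records are assumptions made for illustration, and the OSTA formula shown, 0.2 × (weight in kg − age in years), is the commonly cited definition rather than one stated in the abstract.

```python
import random


def split_8_2(records, seed=0):
    """Shuffle a copy of the records and split it 80% training / 20% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary illustrative choice
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, predictions):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Assumes both classes are present in labels."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def osta_index(weight_kg, age_years):
    """Commonly cited OSTA index (assumption: not given in the abstract).
    Lower values indicate higher osteoporosis risk."""
    return 0.2 * (weight_kg - age_years)


if __name__ == "__main__":
    # Toy records, invented for illustration: (weight_kg, age_years, has_osteoporosis)
    toy = [(48, 72, 1), (52, 68, 1), (60, 55, 0), (65, 50, 0), (50, 61, 1),
           (70, 47, 0), (55, 66, 1), (62, 58, 0), (58, 63, 0), (47, 70, 1)]
    train, test = split_8_2(toy)
    labels = [y for _, _, y in toy]
    # Negate the index so that a higher score means higher risk.
    risk = [-osta_index(w, a) for w, a, _ in toy]
    print(len(train), len(test), auc(labels, risk))
```

The same `auc` function can score any model's predicted probabilities on the held-out test set, so a baseline index such as OSTA and a trained classifier are compared on equal terms.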