Beak Woosun, Park Jihun, Ji Suk
Department of Dental Public Health, Ajou University Graduate School of Clinical Dentistry, Suwon, Republic of Korea.
Department of Dentistry, Gyeonggi Provincial Medical Center Suwon Hospital, Suwon, Republic of Korea.
Heliyon. 2024 Jun 5;10(11):e32496. doi: 10.1016/j.heliyon.2024.e32496. eCollection 2024 Jun 15.
This study aimed to investigate the performance and reliability of data-driven models employing correlational feature analysis and clinical validation for predicting periodontal disease.
The 7th Korea National Health and Nutrition Examination Survey (n = 10,654) was used for correlation analysis to identify significant risk factors for periodontitis. Periodontal prediction models were developed from the selected factors and this database, then internally validated with 5-fold cross-validation and 1000 bootstrap resamples. External validation was conducted with clinical data (n = 120) collected through self-reported questionnaires, clinical periodontal parameters, and radiographic image analysis. Predictive performance was assessed for logistic regression, support vector machine, random forest, XGBoost, and neural network algorithms using the area under the receiver operating characteristic curve (AUC) and other performance metrics.
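The bootstrap evaluation of AUC described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc_score` and `bootstrap_auc_ci` are hypothetical, and the use of a percentile confidence interval is an assumption, since the abstract does not say how the 1000 resamples were summarized.

```python
import random

def auc_score(labels, scores):
    """AUC via the Mann-Whitney identity: the fraction of (positive, negative)
    pairs in which the positive case scores higher; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for AUC (assumed summary, see lead-in).
    Labels are expected to be 0/1 integers."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(auc_score(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_resamples)]
    upper = aucs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

Here `labels` would hold the observed periodontitis status and `scores` the predicted probabilities from any of the five algorithms. The pairwise AUC loop is quadratic in sample size, so a rank-based implementation would be preferable for a survey-sized data set.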
Correlation analysis identified 16 features from over 1000 potential risk factors for periodontitis. The best data-driven model (XGBoost) showed AUC values of 0.823 and 0.796 for internal and external validation, respectively. Modeling with the clinical data instead gave corresponding AUC values of 0.836 and 0.649. In addition, the data-driven model could predict other clinical periodontal parameters, including severe bone loss (AUC = 0.813), gingival bleeding (AUC = 0.694), and tooth loss (AUC = 0.734). A patient case study of prognostic predictions indicated that the predicted probability of periodontitis can be reduced, on average, by 6.0% with smoking cessation and by 0.6% with alcohol cessation.
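The prognostic comparison above amounts to re-scoring a patient after changing one modifiable risk factor. The sketch below shows one way to do this, under two assumptions: that the reported reduction is an absolute difference in predicted probability, and that the model exposes a single-patient scoring function. `mean_risk_change` and `toy_model` are hypothetical names, and the toy coefficients are invented for illustration only; they are not estimates from the study.

```python
import math

def mean_risk_change(predict_proba, patients, feature, new_value):
    """Average change in predicted probability when `feature` is set to
    `new_value`, over patients whose current value differs."""
    deltas = []
    for patient in patients:
        if patient[feature] == new_value:
            continue  # nothing to change for this patient
        counterfactual = dict(patient)
        counterfactual[feature] = new_value
        deltas.append(predict_proba(counterfactual) - predict_proba(patient))
    return sum(deltas) / len(deltas) if deltas else 0.0

def toy_model(patient):
    """Stand-in logistic model with made-up coefficients (NOT from the study)."""
    z = (-1.0
         + 0.03 * (patient["age"] - 40)
         + 0.8 * patient["smoker"]
         + 0.1 * patient["drinker"])
    return 1.0 / (1.0 + math.exp(-z))

patients = [
    {"age": 55, "smoker": 1, "drinker": 1},
    {"age": 40, "smoker": 0, "drinker": 1},
]
smoking_effect = mean_risk_change(toy_model, patients, "smoker", 0)
drinking_effect = mean_risk_change(toy_model, patients, "drinker", 0)
```

A negative return value means the predicted probability falls after the change. With a trained model, `toy_model` would be replaced by that model's probability output for a single patient record.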
Data-driven models for predicting periodontitis and other periodontal parameters were developed from 16 risk factors, demonstrating enhanced prediction performance and reproducibility in internal-external validations.