Department of Orthodontics and Oral Facial Genetics, Indiana University School of Dentistry, Indianapolis, IN, US.
Indiana University School of Dentistry, Indianapolis, IN, US.
Int Orthod. 2023 Sep;21(3):100759. doi: 10.1016/j.ortho.2023.100759. Epub 2023 May 15.
The purpose of the present study was to create a machine learning (ML) algorithm with the ability to predict the extraction/non-extraction decision in a racially and ethnically diverse sample.
Data were gathered from the records of 393 patients (200 non-extraction and 193 extraction) from a racially and ethnically diverse population. Four ML models (logistic regression [LR], random forest [RF], support vector machine [SVM], and neural network [NN]) were trained on a training set (70% of samples) and then tested on the remaining samples (30%). The accuracy and precision of the ML model predictions were assessed using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. The proportion of correct extraction/non-extraction decisions was also calculated.
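The two evaluation measures named above, ROC AUC and the proportion of correct decisions, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the helper names (`split_70_30`, `roc_auc`, `proportion_correct`), the random seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the abstract does not say how the 70/30 split was drawn.

```python
import random
from typing import List, Sequence, Tuple


def split_70_30(n_samples: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and cut them 70/30 (hypothetical split procedure)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(0.7 * n_samples)
    return indices[:cut], indices[cut:]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the chance a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def proportion_correct(labels: Sequence[int], scores: Sequence[float],
                       threshold: float = 0.5) -> float:
    """Fraction of cases where the thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data only: 1 = extraction, 0 = non-extraction; scores are made-up
# model probabilities, not values from the study.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

train_idx, test_idx = split_70_30(393)
print(len(train_idx), len(test_idx))       # 275 118
print(roc_auc(labels, scores))             # 0.9375
print(proportion_correct(labels, scores))  # 0.75
```

The toy numbers show why the two measures can diverge: AUC summarizes ranking quality across all thresholds, whereas the proportion of correct decisions depends on the single threshold chosen.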
The LR, SVM, and NN models performed best, with ROC AUC values of 91.0%, 92.5%, and 92.3%, respectively. The overall proportion of correct decisions was 82%, 76%, 83%, and 81% for the LR, RF, SVM, and NN models, respectively. The features found to be most helpful to the ML algorithms in making their decisions were maxillary crowding/spacing, L1-NB (mm), U1-NA (mm), PFH:AFH, and SN-MP (°), although many other features contributed significantly.
ML models can predict the extraction decision in a racially and ethnically diverse patient population with a high degree of accuracy and precision. Crowding, sagittal, and vertical characteristics all featured prominently in the hierarchy of components most influential to the ML decision-making process.