Wang S F, Xie X J, Zhang L, Chang S, Zuo F F, Wang Y J, Bai Y X
Department of Orthodontics, Capital Medical University School of Stomatology, Beijing 100050, China.
LargeV Instrument Corp., Ltd, Beijing 100084, China.
Zhonghua Kou Qiang Yi Xue Za Zhi. 2023 Jun 9;58(6):561-568. doi: 10.3760/cma.j.cn112144-20230305-00070.
To develop a multi-classification orthodontic image recognition system using the SqueezeNet deep learning model for automatic classification of orthodontic image data. A total of 35 000 clinical orthodontic images were collected at the Department of Orthodontics, Capital Medical University School of Stomatology, from October to November 2020 and from June to July 2021. The images came from 490 orthodontic patients with a male-to-female ratio of 49∶51 and an age range of 4 to 45 years. After data cleaning based on inclusion and exclusion criteria, the final image dataset included 17 453 face images (frontal, smiling, 90° right, 90° left, 45° right, and 45° left), 8 026 intraoral images [frontal occlusion, right occlusion, left occlusion, upper occlusal view (original and flipped), lower occlusal view (original and flipped), and coverage of occlusal relationship], 4 115 X-ray images [lateral skull X-ray from the left side, lateral skull X-ray from the right side, frontal skull X-ray, cone-beam CT (CBCT), and wrist bone X-ray], and 684 other non-orthodontic images. A labeling team composed of orthodontic doctoral students, associate professors, and professors used image labeling tools to classify the images into 20 categories: 6 face image categories, 8 intraoral image categories, 5 X-ray image categories, and 1 category for other images. The data for each label were randomly divided into training, validation, and testing sets in an 8∶1∶1 ratio using the random function in the Python programming language. The improved SqueezeNet deep learning model was used for training, and 13 000 natural images from the ImageNet open-source dataset were used as additional non-orthodontic images to optimize the algorithm's handling of anomalous data. A multi-classification orthodontic image recognition system based on deep learning models was constructed.
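The per-label 8∶1∶1 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_per_label`, the `(image_id, label)` input format, the fixed seed, and the rounding rule (leftover images go to the training set) are all assumptions.

```python
import random
from collections import defaultdict


def split_per_label(items, ratios=(8, 1, 1), seed=0):
    """Split (image_id, label) pairs into train/val/test within each label.

    Hypothetical helper: the abstract only states that Python's random
    function was used to split each label's data in an 8:1:1 ratio.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_label = defaultdict(list)
    for image_id, label in items:
        by_label[label].append(image_id)

    total = sum(ratios)
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        ids = by_label[label]
        rng.shuffle(ids)
        # Integer shares for validation and test; the remainder is training.
        n_val = len(ids) * ratios[1] // total
        n_test = len(ids) * ratios[2] // total
        n_train = len(ids) - n_val - n_test
        splits["train"] += [(i, label) for i in ids[:n_train]]
        splits["val"] += [(i, label) for i in ids[n_train:n_train + n_val]]
        splits["test"] += [(i, label) for i in ids[n_train + n_val:]]
    return splits
```

Splitting inside each label, rather than over the pooled dataset, keeps small categories represented in all three subsets.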
The accuracy of the orthodontic image classification was evaluated using precision, recall, F1 score, and a confusion matrix based on the prediction results of the test set. The reliability of the model's classification logic was verified using the gradient-weighted class activation mapping (Grad-CAM) method to generate heat maps. After data cleaning and labeling, a total of 30 278 images were included in the dataset. The test set classification results showed that the precision, recall, and F1 scores of most classification labels were 100%, with only 5 misclassified images out of 3 047, resulting in a system accuracy of 99.84% (3 042/3 047). The precision of anomaly data processing was 100% (10 500/10 500). The heat maps showed that the basis on which the SqueezeNet deep learning model made its classification decisions was largely consistent with that of humans. This study developed a multi-classification orthodontic image recognition system for automatic classification of 20 types of orthodontic images based on the improved SqueezeNet deep learning model. The system exhibited good accuracy in orthodontic image classification.
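The evaluation metrics named above can be derived from a confusion matrix as in the sketch below. It uses plain Python and toy labels; the function names and the convention of returning 0.0 when a denominator is zero are assumptions, not details from the study. The last line checks the reported accuracy arithmetic (3 042 correct out of 3 047).

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall, f1)} from a confusion matrix."""
    n = len(labels)
    metrics = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fp = sum(matrix[r][k] for r in range(n)) - tp  # predicted k, truly other
        fn = sum(matrix[k]) - tp                       # truly k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Reported overall accuracy: 3042 of 3047 test images classified correctly.
reported_accuracy_percent = round(100 * 3042 / 3047, 2)
```

With 20 labels, the same functions apply unchanged; only the `labels` list grows.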