Benchmarking Deep Learning Models for Tooth Structure Segmentation.

Author information

Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin, Berlin, Germany.

ITU/WHO Focus Group on AI for Health, Topic Group Dental Diagnostics and Digital Dentistry, Geneva, Switzerland.

Publication information

J Dent Res. 2022 Oct;101(11):1343-1349. doi: 10.1177/00220345221100169. Epub 2022 Jun 9.

Abstract

A wide range of deep learning (DL) architectures with varying depths are available, and developers usually choose one or a few of them for their specific task in a nonsystematic way. Benchmarking (i.e., the systematic comparison of state-of-the-art architectures on a specific task) may provide guidance in the model development process and may allow developers to make better decisions. However, comprehensive benchmarking has not yet been performed in dentistry. We aimed to benchmark a range of architecture designs for 1 specific, exemplary case: tooth structure segmentation on dental bitewing radiographs. We built 72 models for tooth structure (enamel, dentin, pulp, fillings, crowns) segmentation by combining 6 different DL network architectures (U-Net, U-Net++, Feature Pyramid Networks, LinkNet, Pyramid Scene Parsing Network, Mask Attention Network) with 12 encoders from 3 different encoder families (ResNet, VGG, DenseNet) of varying depth (e.g., VGG13, VGG16, VGG19). To each model design, 3 initialization strategies (ImageNet, CheXpert, random initialization) were applied, resulting in 216 trained models overall, each trained for up to 200 epochs with the Adam optimizer (learning rate = 0.0001) and a batch size of 32. Our data set consisted of 1,625 human-annotated dental bitewing radiographs. We used a 5-fold cross-validation scheme and quantified model performance primarily by the F1-score. Initialization with ImageNet or CheXpert weights significantly outperformed random initialization (P < 0.05). Deeper and more complex models did not necessarily perform better than less complex alternatives. VGG-based models were more robust across model configurations, while more complex models (e.g., from the ResNet family) achieved peak performances. In conclusion, initializing models with pretrained weights may be recommended when training models for dental radiographic analysis. Less complex model architectures may be competitive alternatives if computational resources and training time are limiting factors. Models developed and found superior on nondental data sets may not show this behavior for dental domain-specific tasks.
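The experimental design described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the abstract names the 6 architectures, the 3 initialization strategies, and the VGG depths, but not the individual ResNet and DenseNet encoders, so those entries are placeholders chosen to give 12 encoders in total. The helper names (`build_grid`, `k_fold_indices`, `f1_score`) are hypothetical, the fold split is a plain contiguous one, and the F1 function is a pixel-wise, single-class version; how the study shuffled data or aggregated scores across classes and folds is not stated in the abstract.

```python
from itertools import product

# Architectures and initialization strategies as named in the abstract.
ARCHITECTURES = [
    "U-Net", "U-Net++", "Feature Pyramid Network",
    "LinkNet", "Pyramid Scene Parsing Network", "Mask Attention Network",
]
INITIALIZATIONS = ["ImageNet", "CheXpert", "random"]

# ASSUMPTION: only the VGG depths appear in the abstract. The ResNet and
# DenseNet entries below are illustrative placeholders that bring the
# total to the 12 encoders the abstract reports.
ENCODERS = {
    "VGG": ["vgg13", "vgg16", "vgg19"],
    "ResNet": ["resnet18", "resnet34", "resnet50", "resnet101", "resnet152"],
    "DenseNet": ["densenet121", "densenet161", "densenet169", "densenet201"],
}


def build_grid():
    """Return (model designs, training runs) for the benchmark grid."""
    encoders = [name for family in ENCODERS.values() for name in family]
    # One model design per (architecture, encoder) pair.
    designs = list(product(ARCHITECTURES, encoders))
    # Each design is trained once per initialization strategy.
    runs = [(arch, enc, init)
            for arch, enc in designs
            for init in INITIALIZATIONS]
    return designs, runs


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous, non-overlapping folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0)
             for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def f1_score(pred, target, label):
    """Pixel-wise F1 for one class over flat sequences of class labels."""
    tp = sum(p == label and t == label for p, t in zip(pred, target))
    fp = sum(p == label and t != label for p, t in zip(pred, target))
    fn = sum(p != label and t == label for p, t in zip(pred, target))
    denom = 2 * tp + fp + fn
    # Returns 0.0 when the class is absent from both prediction and target.
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    designs, runs = build_grid()
    print(len(designs), len(runs))          # 72 216
    folds = k_fold_indices(1625, k=5)
    print([len(fold) for fold in folds])    # [325, 325, 325, 325, 325]
```

With 1,625 radiographs and 5 folds, each validation fold holds 325 images, and each of the 216 runs would be trained and evaluated once per fold.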


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9798/9516600/025f8de357a0/10.1177_00220345221100169-fig1.jpg
