Gao Fei, Tang Yulong
Department of Stomatology, General Hospital of PLA Northern Theater Command, Shenyang, 110002, Liaoning, China.
Sci Rep. 2025 Jul 12;15(1):25205. doi: 10.1038/s41598-025-06229-w.
In orthodontics and maxillofacial surgery, accurate cephalometric analysis and treatment outcome prediction are critical for clinical decision-making. Traditional approaches rely on manual landmark identification, which is time-consuming and subject to inter-observer variability, while existing automated methods typically utilize single imaging modalities with limited accuracy. This paper presents DeepFuse, a novel multi-modal deep learning framework that integrates information from lateral cephalograms, CBCT volumes, and digital dental models to simultaneously perform landmark detection and treatment outcome prediction. The framework employs modality-specific encoders, an attention-guided fusion mechanism, and dual-task decoders to leverage complementary information across imaging techniques. Extensive experiments on three clinical datasets demonstrate that DeepFuse achieves a mean radial error of 1.21 mm for landmark detection, representing a 13% improvement over state-of-the-art methods, with a clinical acceptability rate of 92.4% at the 2 mm threshold. For treatment outcome prediction, the framework attains an overall accuracy of 85.6%, significantly outperforming both conventional prediction models and experienced clinicians. The proposed approach enhances diagnostic precision and treatment planning while providing interpretable visualization of decision factors, demonstrating significant potential for clinical integration in orthodontic and maxillofacial practice.
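The two landmark-detection figures quoted above, mean radial error and the acceptability rate at a 2 mm threshold, are conventionally computed from per-landmark Euclidean distances between predicted and reference positions. The following is a minimal sketch of that standard calculation, not the authors' evaluation code; the function names, the array layout, and the `spacing_mm` parameter are assumptions made for illustration.

```python
import numpy as np


def radial_errors(pred, gt, spacing_mm=1.0):
    """Euclidean distance in mm between predicted and reference landmarks.

    pred, gt: arrays of shape (n_images, n_landmarks, n_dims).
    spacing_mm: assumed pixel/voxel size used to convert coordinates to mm
                (scalar, or one value per dimension).
    """
    diff = (np.asarray(pred, dtype=float) - np.asarray(gt, dtype=float)) * spacing_mm
    return np.linalg.norm(diff, axis=-1)


def mean_radial_error(pred, gt, spacing_mm=1.0):
    """Mean of the radial errors over all images and landmarks, in mm."""
    return float(radial_errors(pred, gt, spacing_mm).mean())


def acceptability_rate(pred, gt, threshold_mm=2.0, spacing_mm=1.0):
    """Fraction of landmarks whose radial error is within threshold_mm."""
    return float((radial_errors(pred, gt, spacing_mm) <= threshold_mm).mean())


if __name__ == "__main__":
    # Toy data (hypothetical): one image, two landmarks, 2D pixel coordinates.
    gt = np.zeros((1, 2, 2))
    pred = np.array([[[3.0, 4.0], [0.0, 1.0]]])
    print(mean_radial_error(pred, gt, spacing_mm=0.1))
    print(acceptability_rate(pred, gt, threshold_mm=2.0, spacing_mm=0.1))
```

Multiplying the acceptability rate by 100 gives a percentage comparable in form to the 92.4% reported in the abstract.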