IEEE Trans Med Imaging. 2022 Apr;41(4):826-835. doi: 10.1109/TMI.2021.3124217. Epub 2022 Apr 1.
Precise segmentation of teeth from intra-oral scanner images is an essential task in computer-aided orthodontic surgical planning. State-of-the-art deep learning-based methods often simply concatenate the raw geometric attributes of mesh cells (i.e., coordinates and normal vectors) to train a single-stream network for automatic intra-oral scanner image segmentation. However, since different raw attributes reveal completely different geometric information, naively concatenating them at the (low-level) input stage may introduce unnecessary confusion in describing and differentiating mesh cells, thus hampering the learning of high-level geometric representations for the segmentation task. To address this issue, we design a two-stream graph convolutional network (TSGCN) that handles inter-view confusion between different raw attributes, fusing their complementary information more effectively and learning discriminative multi-view geometric representations. Specifically, our TSGCN adopts two input-specific graph-learning streams to extract complementary high-level geometric representations from coordinates and normal vectors, respectively. These single-view representations are then fused by a self-attention module that adaptively balances the contributions of the two views, yielding more discriminative multi-view representations for accurate and fully automatic tooth segmentation. We evaluated TSGCN on a real-patient dataset of dental (mesh) models acquired by 3D intraoral scanners. Experimental results show that TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
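The core fusion idea in the abstract, two view-specific streams whose outputs are combined by attention weights rather than concatenated at the input, can be sketched in a few lines. The sketch below is a minimal NumPy illustration, not the paper's actual architecture: the graph convolutions are replaced by hypothetical single linear-plus-ReLU layers, and all weight matrices (`w_c`, `w_n`, `w_att`) and dimensions are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def stream_features(x, w):
    """Hypothetical stand-in for one input-specific stream:
    a single linear layer with ReLU (the paper uses graph convolutions)."""
    return np.maximum(x @ w, 0.0)

def attention_fuse(f_c, f_n, w_att):
    """Self-attention-style fusion sketch: score each view per cell,
    softmax over the two views, then take the weighted sum so the
    contribution of each view is balanced adaptively per cell."""
    scores = np.stack([f_c @ w_att, f_n @ w_att], axis=1)  # (cells, 2)
    a = np.exp(scores - scores.max(axis=1, keepdims=True))
    a = a / a.sum(axis=1, keepdims=True)                   # per-cell view weights
    return a[:, 0:1] * f_c + a[:, 1:2] * f_n

n_cells, d = 5, 8
coords = rng.standard_normal((n_cells, 9))   # 3 vertices x xyz per mesh cell
normals = rng.standard_normal((n_cells, 3))  # one normal vector per cell
w_c = rng.standard_normal((9, d))            # assumed coordinate-stream weights
w_n = rng.standard_normal((3, d))            # assumed normal-stream weights
w_att = rng.standard_normal(d)               # assumed attention scoring vector

fused = attention_fuse(stream_features(coords, w_c),
                       stream_features(normals, w_n), w_att)
print(fused.shape)  # one fused d-dim feature per mesh cell: (5, 8)
```

In a full segmentation network the fused per-cell features would feed a classification head that assigns each mesh cell a tooth label; the point here is only that the two raw attributes are kept in separate streams until the (high-level) feature stage, instead of being concatenated at the input.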