Zhang R, Jin L, Chen Q M, Ding T T, Zhang Q Y, Chen Y W, Tian X, Cao Y Y, Chen X Y, Zhu F D
Center of Information, Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine & Clinical Research Center for Oral Diseases of Zhejiang Province & Key Laboratory of Oral Biomedical Research of Zhejiang Province & Cancer Center of Zhejiang University & Engineering Research Center of Oral Biomaterials and Devices of Zhejiang Province, Hangzhou 310005, China.
Oral Medicine Center, Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine & Clinical Research Center for Oral Diseases of Zhejiang Province & Key Laboratory of Oral Biomedical Research of Zhejiang Province & Cancer Center of Zhejiang University & Engineering Research Center of Oral Biomaterials and Devices of Zhejiang Province, Hangzhou 310005, China.
Zhonghua Kou Qiang Yi Xue Za Zhi. 2025 Mar 9;60(3):239-247. doi: 10.3760/cma.j.cn112144-20241210-00467.
To develop PixelSIFT-UNet, a novel semantic segmentation model that integrates deep learning with the scale-invariant feature transform (SIFT) algorithm, in order to improve the segmentation accuracy of oral mucosal lesions.
This investigation used 838 standard clinical white-light images of oral mucosal diseases acquired from January 2020 to December 2022 at the Stomatology Hospital, Zhejiang University School of Medicine. Images were assigned at random using Python's random.seed and random.sample functions, and the dataset was divided into three subsets in an approximately 6∶2∶2 ratio: training (n=506), validation (n=166), and testing (n=166). Lesion boundaries were annotated with Labelme software, and a PixelSIFT-UNet-based deep learning model was developed with VGG-16 and ResNet-50 backbone networks. Model parameters were optimized on the validation set, and performance metrics [Dice coefficient, mean intersection over union (mIoU), mean pixel accuracy (mPA), and Precision] were assessed on the test set. The model's performance was benchmarked against conventional semantic segmentation frameworks (U-Net and PSPNet).
The PixelSIFT-UNet model segmented three common oral mucosal lesions: oral lichen planus, oral leukoplakia, and oral submucous fibrosis. With VGG-16 as the backbone network, the model achieved Dice coefficient, mIoU, mPA, and Precision values of 0.642, 0.699, 0.836, and 0.792, respectively. With ResNet-50 as the backbone network, the corresponding values were 0.668, 0.733, 0.872, and 0.817. These were higher on all four metrics than the conventional U-Net model (0.662, 0.717, 0.861, and 0.809) and higher than the PSPNet model (0.671, 0.721, 0.858, and 0.813) on mIoU, mPA, and Precision; the reported Dice coefficient of PSPNet (0.671) was slightly higher than that of PixelSIFT-UNet (0.668).
The proposed PixelSIFT-UNet architecture with a ResNet-50 backbone outperformed the conventional U-Net and PSPNet semantic segmentation models on most of the reported metrics for oral mucosal lesion segmentation, yielding quantitative improvements in segmentation accuracy.
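The data split and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `split_dataset` and `segmentation_metrics`, the seed value, and the per-class averaging are assumptions made for illustration, and the abstract does not state how the reported Dice, mIoU, mPA, and Precision values were averaged. The split uses the standard library `random` module, and the metrics are derived from a confusion matrix built with NumPy.

```python
import random

import numpy as np


def split_dataset(items, counts=(506, 166, 166), seed=0):
    """Reproducibly split a list into training, validation, and test subsets.

    The subset sizes follow the abstract; the seed value is an assumption.
    """
    if sum(counts) != len(items):
        raise ValueError("subset sizes must sum to the dataset size")
    rng = random.Random(seed)  # local generator, so global state is untouched
    shuffled = rng.sample(items, len(items))  # sampling all items = shuffling
    n_train, n_val, _ = counts
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def segmentation_metrics(y_true, y_pred, num_classes):
    """Class-averaged Dice, IoU, pixel accuracy, and precision.

    y_true and y_pred are integer label maps of the same shape.
    Assumption: each metric is computed per class and then averaged,
    skipping classes for which the metric is undefined.
    """
    t = np.asarray(y_true, dtype=np.int64).ravel()
    p = np.asarray(y_pred, dtype=np.int64).ravel()
    # Confusion matrix with rows = true class, columns = predicted class.
    cm = np.bincount(t * num_classes + p, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes).astype(float)

    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp  # predicted as the class, but truly another
    fn = cm.sum(axis=1) - tp  # truly the class, but predicted as another

    with np.errstate(divide="ignore", invalid="ignore"):
        dice = 2 * tp / (2 * tp + fp + fn)
        iou = tp / (tp + fp + fn)
        pixel_acc = tp / (tp + fn)
        precision = tp / (tp + fp)

    return {
        "dice": float(np.nanmean(dice)),
        "mIoU": float(np.nanmean(iou)),
        "mPA": float(np.nanmean(pixel_acc)),
        "precision": float(np.nanmean(precision)),
    }


if __name__ == "__main__":
    train, val, test = split_dataset(list(range(838)))
    print(len(train), len(val), len(test))
    print(segmentation_metrics([[0, 0], [1, 1]], [[0, 1], [1, 1]], 2))
```

Per-class pixel accuracy is taken here as the fraction of a class's true pixels that are labeled correctly, which is one common convention; other definitions exist and would give different values.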