Kaba Esat, Solak Merve, Varlık Ayşenur Topçu, Çubukçu Yusuf, Sağır Lütfullah, Sünnetci Kubilay Muhammed, Alkan Ahmet, Gündoğdu Hasan, Çeliker Fatma Beyazal, Beyazal Mehmet
Recep Tayyip Erdogan University, Department of Radiology, Rize.
Osmaniye Korkut Ata University, Department of Electrical and Electronics Engineering, Osmaniye, Kahramanmaraş Sütçü İmam University, Department of Electrical and Electronics Engineering, Kahramanmaraş.
Med Ultrason. 2025 Mar 2;27(1):26-31. doi: 10.11152/mu-4432. Epub 2024 Sep 4.
Aims: This study uses deep learning (DL) to classify thyroid nodules as benign or malignant on ultrasonography (US) images. It also investigates how DL assistance affects the diagnostic performance of radiologists with different levels of experience. Material and methods: This study included 576 US images of thyroid nodules. The dataset was divided into a training set (80%) and a test set (20%). Four radiologists with different levels of experience classified each image in the test set as benign or malignant. A DL model was then trained on the training set and used to predict benign or malignant for each image in the test set. Finally, the DL model's output for each nodule in the test set was presented to the four radiologists, who were asked to repeat the benign-malignant classification while taking these DL results into account.
Results: The accuracy of the DL model was 0.9391. Before DL assistance, the accuracies of junior resident (JR) 1, JR 2, the senior resident (SR), and the senior radiologist (Srad) were 0.7043, 0.7826, 0.8435, and 0.8522, respectively. With DL assistance, the accuracies were 0.9130, 0.8696, 0.9304, and 0.9043 for JR 1, JR 2, SR, and Srad, respectively. DL assistance changed the decisions of less experienced radiologists more often than those of more experienced radiologists. Conclusion: The DL model classified thyroid nodules as benign or malignant on US images more accurately than radiologists at any of the experience levels studied. Additionally, all radiologists, and most notably the less experienced radiology residents, increased their accuracy when making DL-assisted predictions.
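The reported accuracies can be turned into per-reader gains with simple arithmetic. The sketch below is illustrative only and is not the authors' code. It assumes that the test set is 20% of 576 images rounded to a whole number (115 images), a figure the abstract does not state explicitly. The implied counts of correct classifications are therefore back-calculated estimates, not reported data, and the names `reported`, `implied_correct`, and `summarize` are hypothetical.

```python
# Accuracies as reported in the abstract: (before DL assistance, with DL assistance).
reported = {
    "JR 1": (0.7043, 0.9130),
    "JR 2": (0.7826, 0.8696),
    "SR": (0.8435, 0.9304),
    "Srad": (0.8522, 0.9043),
}
DL_ACCURACY = 0.9391

# Assumption (not stated in the abstract): the test set is 20% of the
# 576 images, rounded to the nearest whole image.
TOTAL_IMAGES = 576
n_test = round(TOTAL_IMAGES * 0.20)


def implied_correct(accuracy, n):
    """Estimate how many of n images were classified correctly."""
    return round(accuracy * n)


def summarize(readers, n):
    """Return the accuracy gain and implied correct counts for each reader."""
    rows = {}
    for reader, (before, after) in readers.items():
        rows[reader] = {
            "gain": round(after - before, 4),
            "correct_before": implied_correct(before, n),
            "correct_after": implied_correct(after, n),
        }
    return rows


summary = summarize(reported, n_test)
dl_correct = implied_correct(DL_ACCURACY, n_test)

for reader, row in summary.items():
    print(reader, row)
print("DL model implied correct:", dl_correct, "of", n_test)
```

Under this assumption, the absolute accuracy gain is largest for JR 1 (0.2087) and smallest for Srad (0.0521), with JR 2 (0.0870) and SR (0.0869) in between, which is consistent with the abstract's statement that less experienced readers benefited most.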