Bagheri Soroush, Hajianfar Ghasem, Sabouri Maziar, Gharibi Omid, Yazdani Babak, Aghaee Atena, Nickfarjam Ali Mohammad, Yazdani Akram, Aliasgharzadeh Akbar, Moradi Habiballah, Rahmim Arman, Zaidi Habib
Department of Medical Physics and Radiology, Allied Medical Sciences Faculty, Kashan University of Medical Sciences, Kashan, Iran.
Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland.
Clin Nucl Med. 2025 Aug 1;50(8):683-694. doi: 10.1097/RLU.0000000000005995. Epub 2025 Jun 17.
Thyroid diseases are the second most common group of hormonal disorders and require accurate diagnosis. Advances in artificial intelligence and radiomics have enhanced diagnostic precision by analyzing quantitative imaging features. However, reproducibility challenges arising from factors such as field-of-view (FOV) zooming and segmentation variability limit the clinical application of radiomics-based models.
This study evaluated the impact of segmentation and FOV zooming on the reproducibility of radiomic features, and whether restricting machine learning (ML) models to reproducible features improves the classification of thyroid scintigraphy images into normal, diffuse goiter (DG), multinodular goiter (MNG), and thyroiditis.
A retrospective analysis was conducted on 872 thyroid scintigraphy cases from 3 centers. Radiomic feature reproducibility was assessed using the intraclass correlation coefficient (ICC), with robust features (ICC≥0.80) identified separately under segmentation and zooming conditions. Four ML training scenarios were implemented on center A data, using (1) all features, (2) zoom-robust features, (3) segmentation-robust features, and (4) mutually robust features, each combined with 3 feature selection methods and 7 classifiers. Models were validated on external data sets (centers B and C).
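The robustness filtering described above can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form ICC(2,1) is an assumption here, as are the function names, the feature names, and the synthetic data; none of this is the authors' code.

```python
import numpy as np

def icc_2_1(x):
    """ICC(2,1) for an (n_subjects, k_conditions) array of one feature.

    Assumed form: two-way random effects, absolute agreement, single measure.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between conditions
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def robust_features(measurements, threshold=0.80):
    """Return the names of features whose ICC meets the threshold."""
    return {name for name, x in measurements.items() if icc_2_1(x) >= threshold}

# Synthetic stand-in data: 50 cases, each feature measured under 2 conditions.
rng = np.random.default_rng(0)
base = rng.normal(size=50)
stable = np.column_stack([base, base + rng.normal(scale=0.01, size=50)])
unstable = np.column_stack([base, rng.normal(size=50)])

# Hypothetical feature names; columns are repeated zoom levels or segmentations.
zoom = {"glcm_contrast": stable, "firstorder_mean": unstable}
segmentation = {"glcm_contrast": stable, "firstorder_mean": stable}

zoom_robust = robust_features(zoom)
seg_robust = robust_features(segmentation)
mutually_robust = zoom_robust & seg_robust  # robust under both conditions
```

In this sketch the "mutually robust" set is simply the intersection of the zoom-robust and segmentation-robust sets, which matches the fourth training scenario as described in the abstract.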
FOV zooming significantly reduced feature reproducibility (ICC≥0.80: 49%), while segmentation effects were minimal (ICC≥0.80: 96%). Models trained on mutually robust features outperformed those trained using all features. Boruta-MLP achieved the highest accuracy (0.71, P-value <0.001 vs. all features) in zoomed data sets, and RFE-MLP performed best (0.69, P-value <0.001 vs. all features) in the baseline data set, with Gray-Level Co-occurrence Matrix (GLCM) features frequently selected.
Utilizing robust radiomic features significantly improved the performance of ML models in thyroid disease classification, enabling more accurate and generalizable diagnostic outcomes across diverse data sets.