Trimboli P, Colombo A, Gamarra E, Ruinelli L, Leoncini A
Clinic for Endocrinology and Diabetology, Thyroid Unit, Ente Ospedaliero Cantonale (EOC), Bellinzona, Switzerland.
Faculty of Biomedical Sciences, Università Della Svizzera Italiana, Lugano, Switzerland.
J Endocrinol Invest. 2025 Apr;48(4):877-883. doi: 10.1007/s40618-024-02518-9. Epub 2024 Dec 18.
Ultrasound (US) evaluation is recognized as pivotal in assessing the risk of malignancy (RoM) of thyroid nodules (TNs). Recently, various US-based risk-classification systems (Thyroid Imaging Reporting and Data Systems [TIRADSs]) have been developed. An important ongoing project concerns the creation of an international system (I-TIRADS) using a unified terminology. Since online tools allow clinicians and patients to stratify the RoM of any TN, computer scientists (CSs) are likely to play a relevant role. This study explored the performance of a CS in assessing TNs across the TIRADS categories.
The most widely used TIRADSs (i.e., ACR, EU, and K) were considered. Three hundred scenarios were created. A CS was asked to assess the 300 TNs according to ACR-, EU-, and K-TIRADS. These assessments were compared with those of clinicians. Inter-observer agreement was estimated with Cohen's kappa (κ). Word-cloud plots were used to display the US descriptors associated with disagreement.
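As an illustration of the agreement statistic named above, the following is a minimal sketch of unweighted Cohen's kappa for two raters. The function name `cohen_kappa`, the example labels, and the choice to treat an "unclassified" nodule as its own category are assumptions made for this example; the abstract does not state how unclassified cases entered the calculation or which software was used.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over all categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical category labels, not data from the study.
physician = ["2", "3", "4", "5", "4", "3"]
computer_scientist = ["2", "3", "4", "5", "3", "unclassified"]

print(f"kappa = {cohen_kappa(physician, computer_scientist):.2f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it can be considerably lower than the percentage of matching assessments.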
The CS's assessments matched those of the physicians in 100%, 81%, and 43% of cases using ACR-, EU-, and K-TIRADS, respectively. The CS was unable to classify 19/100 TNs according to EU-TIRADS and 15/100 TNs according to K-TIRADS. The inter-observer agreement between the CS and the physicians was excellent for ACR-TIRADS (κ = 1), moderate for EU-TIRADS (κ = 0.56), and fair for K-TIRADS (κ = 0.22). The non-concordant cases involved 16 of the 22 EU-TIRADS descriptors and all 18 K-TIRADS descriptors.
CSs handle the ACR-TIRADS lexicon and structure confidently but not those of EU- and K-TIRADS, probably because the latter are pattern-based systems that require medical training.