Rtam Nabil
Ultrasound Department, Yeovil District Hospital, Somerset NHS Foundation Trust, Yeovil, UK.
Ultrasound. 2024 May;32(2):76-84. doi: 10.1177/1742271X231215500. Epub 2023 Dec 28.
The British Thyroid Association ultrasound (U) classification is a risk stratification model that grades thyroid nodules from U2 to U5 based on their sonographic appearance. Variability between ultrasound operators in U-scoring is reported in the literature, and some evidence of it was found in the author's department. The aim of this study was to investigate whether there is significant disagreement in the department and to identify potential reasons for the variability.
Eight operators, both radiologists and sonographers, were recruited to grade 33 thyroid nodules and answer a tick-box questionnaire using the British Thyroid Association lexicon. The inter-operator variability for the U-categories, indication for fine-needle aspiration biopsy and ultrasound features was assessed using Fleiss' kappa and Gwet's AC1. The operators' accuracy was measured against the most experienced operator in the department using Cohen's kappa and percentage agreement.
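As an illustration of the two multi-rater statistics named above, the following is a minimal sketch of the standard Fleiss' kappa and Gwet's AC1 formulas for nominal categories with the same number of raters per nodule. It is not the study's analysis code: the function names and the toy ratings are hypothetical, and no study data are reproduced.

```python
from typing import List, Sequence


def counts_from_ratings(ratings: Sequence[Sequence[str]],
                        categories: Sequence[str]) -> List[List[int]]:
    """Turn per-nodule rater labels into a nodule-by-category count table."""
    return [[list(row).count(c) for c in categories] for row in ratings]


def _observed_agreement(counts: Sequence[Sequence[int]]) -> float:
    """Mean proportion of agreeing rater pairs per nodule."""
    n = sum(counts[0])  # raters per nodule, assumed constant
    if n < 2 or any(sum(row) != n for row in counts):
        raise ValueError("each nodule needs the same number (>= 2) of ratings")
    per_item = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    return sum(per_item) / len(per_item)


def _category_shares(counts: Sequence[Sequence[int]]) -> List[float]:
    """Share of all ratings that fell into each category."""
    total = sum(sum(row) for row in counts)
    return [sum(col) / total for col in zip(*counts)]


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa; chance agreement is the sum of squared category shares.

    Undefined (division by zero) if every rating falls in a single category.
    """
    p_o = _observed_agreement(counts)
    p_e = sum(s * s for s in _category_shares(counts))
    return (p_o - p_e) / (1 - p_e)


def gwet_ac1(counts: Sequence[Sequence[int]]) -> float:
    """Gwet's AC1; chance agreement is sum(s * (1 - s)) / (k - 1)."""
    p_o = _observed_agreement(counts)
    shares = _category_shares(counts)
    p_e = sum(s * (1 - s) for s in shares) / (len(shares) - 1)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: 4 nodules, each graded by 3 operators.
U_CATEGORIES = ["U2", "U3", "U4", "U5"]
toy_ratings = [
    ["U2", "U2", "U2"],
    ["U3", "U3", "U4"],
    ["U5", "U4", "U5"],
    ["U3", "U2", "U3"],
]
toy_counts = counts_from_ratings(toy_ratings, U_CATEGORIES)
print("Fleiss' kappa:", round(fleiss_kappa(toy_counts), 3))
print("Gwet's AC1:", round(gwet_ac1(toy_counts), 3))
```

Both statistics share the same observed agreement and differ only in how chance agreement is estimated, which is why they can diverge on the same ratings when the categories are used unevenly.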
Fair agreement (Fleiss' κ = 0.21) was obtained between the participants when U-scoring (U2-5). Fair-to-moderate agreement was noted between sonographers (κ = 0.40). Significant variability was demonstrated between radiologists (p > 0.05). Indication for fine-needle aspiration biopsy reached fair to almost substantial agreement (radiologists' AC1 = 0.34, sonographers' AC1 = 0.58, overall AC1 = 0.41). No significant variability was measured for echogenicity (κ = 0.29), composition (κ = 0.33), shape (κ = 0.58), margin (κ = 0.45), halo (κ = 0.34) and vascularity (κ = 0.44). Accuracy reached fair agreement (mean Cohen's κ = 0.29) and moderate agreement (mean AC1 = 0.53) for the U-categories and fine-needle aspiration biopsy, respectively. Radiologists demonstrated lower accuracy.
No significant inter-rater variability in U-scoring or in recommending fine-needle aspiration biopsy was demonstrated among all the operators in the department. Radiologists showed significant variability in U-scoring and lower accuracy. Reliability and accuracy could be improved by addressing the problematic categories and features identified in this study.