Couzins Michael, Forbes Stuart, Vigneswaran Ganesh, Mitra Indu, Rutherford Elizabeth E
University Hospital Southampton NHS Foundation Trust, Southampton, UK.
Chelsea and Westminster NHS Hospital, London, UK.
Ultrasound. 2021 May;29(2):100-105. doi: 10.1177/1742271X20971323. Epub 2020 Nov 16.
The U-score ultrasound classification (graded U1-U5) is widely used to grade thyroid nodules based on benign and malignant sonographic features. Ultrasound is well established as an operator-dependent imaging modality, so imaging-based scoring systems are susceptible to subjective variation between operators. We aimed to assess whether there is any intra- or interobserver variability when U-scoring thyroid nodules and whether previous thyroid ultrasound experience affects this variability.
A total of 14 ultrasound operators were identified (five experienced thyroid operators, five with intermediate experience and four with no experience) and were asked to U-score images from 20 thyroid cases shown as a single projection, with and without Doppler flow. The cases were subsequently rescored by the 14 operators after six weeks. The first and second round U-scores for the three operator groups were then analysed using Fleiss' kappa to assess interobserver variability and Cochran's Q test to determine any intraobserver variability.
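As an illustration of the interobserver statistic named above, the following is a minimal sketch of Fleiss' kappa for a panel of raters scoring the same cases. The function name `fleiss_kappa`, the input layout (one list of labels per case) and the toy scores are assumptions made for this example; they are not the study's data or the authors' analysis code.

```python
from collections import Counter
import math


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a fixed number of raters per case.

    ratings: list of cases, each a list with one label per rater
             (hypothetical layout chosen for this sketch).
    categories: all labels that may appear, e.g. ["U1", ..., "U5"].
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    if any(label not in categories for case in ratings for label in case):
        raise ValueError("rating outside the declared categories")

    # counts[i][j] = number of raters who gave case i the category j
    counts = [[Counter(case)[c] for c in categories] for case in ratings]

    # Observed agreement: mean over cases of the share of agreeing rater pairs
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_cases

    # Chance agreement from the overall proportion of each category
    total = n_cases * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)

    if math.isclose(p_e, 1.0):
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1 - p_e)


# Toy example (invented scores): 3 raters, 4 cases
toy = [
    ["U2", "U2", "U2"],
    ["U3", "U3", "U2"],
    ["U5", "U5", "U5"],
    ["U3", "U4", "U3"],
]
kappa = fleiss_kappa(toy, ["U1", "U2", "U3", "U4", "U5"])
```

Kappa compares observed pairwise agreement with the agreement expected by chance, so it equals 1 for perfect agreement and falls to 0 or below when raters agree no more often than chance would predict.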
We found no significant interobserver variability on combined assessment of all operators, with fair agreement in round 1 (Fleiss' kappa = 0.30, p < 0.0001) and slight agreement in round 2 (Fleiss' kappa = 0.19, p < 0.0001). Cochran's Q test revealed no significant intraobserver variability for any of the 14 operators between round 1 and round 2 (all p > 0.05).
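The descriptors "fair" and "slight" match the commonly cited Landis and Koch bands for kappa. The sketch below maps a kappa value to those bands; it assumes that scale was the one used here, which the abstract does not state, and the function name `agreement_label` is hypothetical.

```python
def agreement_label(kappa):
    """Map a kappa value to the Landis and Koch descriptive bands.

    Assumption: the abstract's "fair" and "slight" follow this scale.
    """
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


round_1 = agreement_label(0.30)  # reported round 1 kappa
round_2 = agreement_label(0.19)  # reported round 2 kappa
```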
We found no statistically significant inter- or intraobserver variability in the U-scoring of thyroid nodules among all participants, reinforcing the validity of this scoring method in clinical practice and allaying concerns regarding potential subjective biases in reporting.