Division of Pediatrics and Neonatal Critical Care, "A. Béclère" Medical Center, South Paris University Hospitals, APHP, Paris, France.
Division of Pediatrics and Neonatal Critical Care, "A. Béclère" Medical Center, South Paris University Hospitals, APHP, Paris, France; Physiopathology and Therapeutic Innovation Unit-INSERM U999, South Paris-Saclay University, Paris, France.
Chest. 2020 Apr;157(4):924-931. doi: 10.1016/j.chest.2019.11.013. Epub 2019 Nov 27.
The effect of different probes and operator experience on the reliability of lung ultrasound (LU) interpretation has not been investigated. We studied the effect of probes and operator experience on the interpretation reliability of LU in critically ill neonates.
This was a prospective, blind, cohort study enrolling patients with basic patterns ("B," "severe B," consolidation). Patients were scanned with microlinear (15 MHz; L15), phased-array sectorial (6-12 MHz; S7), and microconvex (8 MHz; C8) probes, in random order. Static images were acquired in high resolution, anonymized, and included in a pictorial database in random sequences. Seventeen clinicians with different LU experience were asked to blindly assess the pictorial database. Interrater agreement and interpretation reliability were analyzed. Subanalyses according to expertise and probe, and multivariate linear regression (including an "expertise × probe" interaction factor), were also performed.
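As an illustration of how interrater agreement of this kind can be quantified, the sketch below computes an intraclass correlation coefficient from a table of ratings (one row per image, one column per rater). The abstract does not state which ICC form the authors used; the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is assumed here, and the function name `icc_2_1` and the numeric coding of patterns are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows (one per rated image), each row holding one
    numeric score per rater. Assumes a complete table with no missing scores
    and at least two images and two raters. The choice of ICC form is an
    assumption, not something stated in the abstract.
    """
    n = len(ratings)      # number of rated images (subjects)
    k = len(ratings[0])   # number of raters

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares for images, raters, and residual error
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical coding: 0 = "B", 1 = "severe B", 2 = consolidation.
# Four images scored by three raters; the values are made up for illustration.
example = [
    [0, 0, 0],
    [1, 1, 2],
    [2, 2, 2],
    [1, 0, 1],
]
print(round(icc_2_1(example), 3))
```

Values close to 1 indicate that raters assign nearly identical scores to the same image; a subgroup analysis like the one described would apply the same calculation separately to the ratings of each expertise group or probe.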
The agreement tends to be lower and more heterogeneous for residents (intraclass correlation coefficient [ICC], 0.82 [95% CI, 0.74-0.9], P < .001; I², 67%, P = .04) and for fellows (ICC, 0.93 [95% CI, 0.9-0.97], P < .001; I², 69%, P = .04), especially when using nonlinear probes, compared with senior physicians (ICC, 0.95 [95% CI, 0.93-0.96], P < .001; I², 0%, P = .433). Area under the curve (AUC) values were high for all probes (L15, 0.96 [95% CI, 0.93-0.99]; C8, 0.91 [95% CI, 0.85-0.98]; S7, 0.86 [95% CI, 0.82-0.91]) and physicians (senior physicians, 0.95 [95% CI, 0.83-0.99]; fellows, 0.95 [95% CI, 0.75-0.99]; residents, 0.86 [95% CI, 0.5-0.99]). Worse reliability and higher heterogeneity were found when the evaluation was performed by residents (AUC, 0.9 [95% CI, 0.85-0.94], P < .01; I², 93.6%, P < .001) than by fellows (AUC, 0.99 [95% CI, 0.9-0.999], P < .001; I², 34.3%, P = .09) and/or by senior physicians (AUC, 0.99 [95% CI, 0.9-0.999], P < .001; I², 18%, P = .236). The "expertise × probe" interaction factor was associated with lower ICC (standardized regression coefficient β, -0.69; P < .0001; adjusted R², 0.99) and AUC (standardized regression coefficient β, -0.76; P < .0001; adjusted R², 0.98).
LU interpretation in neonates shows good interrater agreement and reliability, irrespective of the probe and rater expertise. The use of nonlinear probes by novice operators is associated with the lowest agreement and reliability.