Boero Enrico, Gargani Luna, Schreiber Annia, Rovida Serena, Martinelli Giampaolo, Maggiore Salvatore Maurizio, Urso Felice, Camporesi Anna, Tullio Annarita, Lombardi Fiorella Anna, Cammarota Gianmaria, Biasucci Daniele Guerino, Bignami Elena Giovanna, Deana Cristian, Volpicelli Giovanni, Livigni Sergio, Vetrugno Luigi
Department of Anaesthesia and Intensive Care Unit, San Giovanni Bosco Hospital, Turin, Italy.
Department of Surgical, Medical and Molecular Pathology and Critical Care Medicine, University of Pisa, Pisa, Italy.
J Anesth Analg Crit Care. 2024 Jul 31;4(1):50. doi: 10.1186/s44158-024-00187-x.
Lung ultrasonography (LUS) is a non-invasive imaging method used to diagnose and monitor conditions such as pulmonary edema, pneumonia, and pneumothorax. It is especially valuable where other imaging techniques, such as CT or chest X-ray, are hard to access, particularly in low- and middle-income countries with limited resources. Furthermore, LUS reduces radiation exposure and the associated risk of blood cancers, which is particularly relevant in children and young subjects. The LUS score allows semi-quantification of regional loss of aeration and can provide a valuable and reliable assessment of the severity of most respiratory diseases. However, the inter-observer reliability of the score has never been systematically assessed. This study aims to assess agreement among experienced LUS operators on a sample of video clips showing predefined findings.
Twenty-five anonymized video clips covering the full range of LUS score values were shown, via an online form, to renowned LUS experts blinded to the patients' clinical data and to the study's aims. Clips were acquired with five different ultrasound machines. Fleiss-Cohen weighted kappa was used to evaluate agreement among the experts.
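As a rough illustration of the weighting named above: Fleiss-Cohen weights are commonly defined as quadratic weights, w_ij = 1 - (i - j)^2 / (k - 1)^2, so that a disagreement between adjacent scores is penalized less than one between distant scores. The sketch below builds these weights for the four LUS score values (0 to 3) and computes a weighted kappa for two raters only. The function names and the two-rater restriction are assumptions made for illustration; the abstract does not state how the statistic was extended to 20 raters.

```python
def quadratic_weights(k):
    """Fleiss-Cohen (quadratic) agreement weights for k ordered categories."""
    return [[1 - ((i - j) ** 2) / ((k - 1) ** 2) for j in range(k)]
            for i in range(k)]


def weighted_kappa(rater_a, rater_b, k=4):
    """Quadratic-weighted Cohen's kappa for two raters (illustrative sketch).

    rater_a, rater_b: equal-length lists of integer scores in 0..k-1.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of clips")
    n = len(rater_a)
    w = quadratic_weights(k)

    # Observed joint proportions of (score by A, score by B).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[a][b] += 1 / n

    # Marginal proportions for each rater.
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed and chance-expected agreement.
    p_obs = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k))
    p_exp = sum(w[i][j] * row[i] * col[j] for i in range(k) for j in range(k))
    if p_exp == 1:
        return 1.0  # degenerate case: both raters used a single category
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical scores for eight clips, not data from the study.
scores_a = [0, 0, 1, 2, 3, 3, 1, 2]
scores_b = [0, 0, 1, 1, 3, 3, 2, 2]
print(round(weighted_kappa(scores_a, scores_b), 3))
```

With quadratic weights, scoring a clip 1 instead of 2 still earns 8/9 of full credit, whereas scoring it 0 instead of 3 earns none.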
Over a period of 3 months, 20 experienced operators completed the assessment. They worked in the intensive care unit (ICU, 10), emergency department (ED, 6), high-dependency unit (HDU, 2), cardiology ward (1), or obstetrics/gynecology department (1). The mean proportional LUS score was 15.3 (SD 1.6). Inter-rater agreement varied: 6 clips had full agreement, 3 had 19 of 20 raters agreeing, and 3 had 18 agreeing, while the remaining 13 had 17 or fewer raters agreeing on the assigned score. Scores 0 and 3 were more reproducible than scores 1 and 2. Fleiss' kappa for overall answers was 0.87 (95% CI 0.815-0.931, p < 0.001).
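For readers unfamiliar with the statistic, the following is a minimal sketch of the standard unweighted Fleiss' kappa for many raters, computed from a table of per-clip category counts. It is an assumption-laden illustration, not the authors' analysis code: the function name is hypothetical, the example counts are invented, and the weighted variant used in the study is not reproduced here.

```python
def fleiss_kappa(counts):
    """Unweighted Fleiss' kappa.

    counts[i][j] is the number of raters who assigned clip i to category j.
    Every clip must be scored by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each clip needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]

    # Per-clip agreement: fraction of rater pairs that agree on the clip.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]

    p_obs = sum(p_item) / n_items        # mean observed agreement
    p_exp = sum(p * p for p in p_cat)    # agreement expected by chance
    if p_exp == 1:
        return 1.0  # degenerate case: every rating is in one category
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical counts: 4 clips, 20 raters, LUS scores 0..3 (not study data).
example = [
    [20, 0, 0, 0],   # full agreement on score 0
    [1, 17, 2, 0],   # most raters chose score 1
    [0, 3, 16, 1],   # most raters chose score 2
    [0, 0, 1, 19],   # 19 of 20 raters chose score 3
]
print(round(fleiss_kappa(example), 3))
```

A value of 1 indicates complete agreement and 0 indicates agreement no better than chance.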
Inter-rater agreement among experienced LUS operators is very high, although not perfect. The strong agreement and small variance indicate that a 20% tolerance around a measured LUS score is a reliable estimate of the patient's true LUS score, which reduces variability in score interpretation and supports greater confidence in its clinical use.