Department of Emergency Medicine, Rush University Medical Center, Chicago, IL, United States of America.
Rush Medical College, Chicago, IL, United States of America.
Am J Emerg Med. 2023 Aug;70:109-112. doi: 10.1016/j.ajem.2023.05.029. Epub 2023 May 26.
Lung ultrasound can evaluate for pulmonary edema, but data suggest moderate inter-rater reliability among users. Artificial intelligence (AI) has been proposed as a model to increase the accuracy of B line interpretation. Early data suggest a benefit among more novice users, but data are limited among average residency-trained physicians. The objective of this study was to compare the accuracy of AI versus real-time physician assessment for B lines.
This was a prospective, observational study of adult Emergency Department patients presenting with suspected pulmonary edema. We excluded patients with active COVID-19 or interstitial lung disease. A physician performed thoracic ultrasound using the 12-zone technique. The physician recorded a video clip in each zone and provided an interpretation of positive (≥3 B lines or a wide, dense B line) or negative (<3 B lines and the absence of a wide, dense B line) for pulmonary edema based upon the real-time assessment. A research assistant then utilized the AI program to analyze the same saved clip to determine if it was positive versus negative for pulmonary edema. The physician sonographer was blinded to this assessment. The video clips were then reviewed independently by two expert physician sonographers (ultrasound leaders with >10,000 prior ultrasound image reviews) who were blinded to the AI and initial determinations. The experts reviewed all discordant values and reached consensus on whether the field (i.e., the area of lung between two adjacent ribs) was positive or negative using the same criteria as defined above, which served as the gold standard.
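The per-zone reading rule described above can be expressed as a small predicate. This is a minimal sketch, not the study's or the AI vendor's implementation; the function and parameter names are hypothetical, and only the thresholds (≥3 B lines, or a wide, dense B line) and the 12-zone protocol come from the text.

```python
ZONES_PER_PATIENT = 12  # 12-zone thoracic ultrasound technique, per the methods


def zone_positive(b_line_count: int, wide_dense_b_line: bool) -> bool:
    """Return True if a lung zone is read as positive for pulmonary edema.

    Positive: >= 3 B lines OR a wide, dense B line.
    Negative: < 3 B lines AND no wide, dense B line.
    """
    if b_line_count < 0:
        raise ValueError("b_line_count must be non-negative")
    return b_line_count >= 3 or wide_dense_b_line


def total_zones(n_patients: int) -> int:
    """Number of lung zones scanned for a cohort under the 12-zone technique."""
    return n_patients * ZONES_PER_PATIENT
```

Under this sketch, a zone with two discrete B lines and no wide, dense B line is negative, while a zone with a single wide, dense B line is positive regardless of the count.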
71 patients were included in the study (56.3% female; mean BMI: 33.4 [95% CI 30.6-36.2]), with 88.3% (752/852) of lung fields being of adequate quality for assessment. Overall, 36.1% of lung fields were positive for pulmonary edema. The physician was 96.7% (95% CI 93.8%-98.5%) sensitive and 79.1% (95% CI 75.1%-82.6%) specific. The AI software was 95.6% (95% CI 92.4%-97.7%) sensitive and 64.1% (95% CI 59.8%-68.5%) specific.
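Sensitivity and specificity of this kind are computed from a 2x2 table of reader calls against the expert consensus. The sketch below shows one way to do this with a Wilson score interval. The abstract does not state which confidence-interval method the authors used, so the Wilson interval is an assumption, and the counts in the usage example are illustrative placeholders, not study data.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def diagnostic_accuracy(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity and specificity, each with a Wilson 95% interval.

    tp/fn: reference-positive zones called positive/negative by the reader.
    tn/fp: reference-negative zones called negative/positive by the reader.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


if __name__ == "__main__":
    # Illustrative counts only (hypothetical, NOT taken from the study).
    result = diagnostic_accuracy(tp=90, fn=10, tn=80, fp=20)
    for name, value in result.items():
        print(name, value)
```

The same function would be applied once to the physician's calls and once to the AI software's calls, each scored against the expert consensus over the lung fields of adequate quality.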
Both the physician and AI software were highly sensitive, though the physician was more specific. Future research should identify which factors are associated with increased diagnostic accuracy.