Performance of a point-of-care ultrasound platform for artificial intelligence-enabled assessment of pulmonary B-lines.

Authors

Labaf Ashkan, Åhman-Persson Linda, Husu Leo Silvén, Smith J Gustav, Ingvarsson Annika, Evaldsson Anna Werther

Affiliations

Department of Clinical Sciences Lund, Cardiology, Section for Heart Failure and Valvular Disease, Lund University, Skåne University Hospital, Klinikgatan 15, Lund, 221 85, Sweden.

Department of Internal and Emergency Medicine, Skåne University Hospital, Malmö, Sweden.

Publication information

Cardiovasc Ultrasound. 2025 Mar 3;23(1):3. doi: 10.1186/s12947-025-00338-2.

DOI:10.1186/s12947-025-00338-2

PMID:40025516

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11874383/

Abstract

BACKGROUND

The incorporation of artificial intelligence (AI) into point-of-care ultrasound (POCUS) platforms has rapidly increased. The number of B-lines present on lung ultrasound (LUS) serves as a useful tool for the assessment of pulmonary congestion. Interpretation, however, requires experience, and AI automation has therefore been pursued. This study aimed to test the agreement between the AI software embedded in a major vendor's POCUS system and visual expert assessment.

METHODS

This single-center prospective study included 55 patients hospitalized for various respiratory symptoms, predominantly acutely decompensated heart failure. A 12-zone protocol was used. Two experts in LUS independently categorized the number of B-lines into four groups: 0, 1-2, 3-4, and ≥ 5. The intraclass correlation coefficient (ICC) was used to determine agreement.

RESULTS

A total of 672 LUS zones were obtained, with 584 (87%) eligible for analysis. Compared with the expert reviewers, the AI significantly overcounted the number of B-lines per patient (23.5 vs. 2.8, p < 0.001). The AI found a greater proportion of zones with > 5 B-lines than the reviewers did (38% vs. 4%, p < 0.001). The ICC between the AI and the reviewers was 0.28 for the total sum of B-lines and 0.37 for the zone-by-zone method. The interreviewer agreement was excellent, with ICCs of 0.92 and 0.91, respectively.

CONCLUSION

This study demonstrated excellent interrater reliability of B-line counts from experts but poor agreement with the AI software embedded in a major vendor system, primarily due to overcounting. Our findings indicate that further development is needed to increase the accuracy of AI tools in LUS.
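
As an illustration of the agreement statistic named in the methods, the following is a minimal sketch of one common ICC form, ICC(2,1) (two-way random effects, absolute agreement, single rater). The abstract does not state which ICC form the authors used, so this choice is an assumption, and the function name `icc2_1` and the toy ratings are hypothetical rather than study data.

```python
from typing import Sequence


def icc2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] is the score given to subject i by rater j.
    Assumes a complete table with at least 2 subjects and 2 raters,
    and at least some variation in the scores.
    """
    n = len(ratings)       # number of subjects (e.g. patients or zones)
    k = len(ratings[0])    # number of raters (e.g. AI and one reviewer)

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical toy data: rater 2 ranks subjects identically to rater 1
# but adds a constant offset of 10 to every count.
identical = [[1, 1], [2, 2], [3, 3]]
offset = [[1, 11], [2, 12], [3, 13]]
print(icc2_1(identical), icc2_1(offset))
```

In this toy example the identical ratings give an ICC of 1, whereas the constant offset drives the absolute-agreement ICC down to about 0.02 even though both raters order the subjects the same way. This shows how systematic overcounting alone can produce poor agreement under this ICC form.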
