A Stress Test of Artificial Intelligence: Can Deep Learning Models Trained From Formal Echocardiography Accurately Interpret Point-of-Care Ultrasound?

Affiliations

Department of Emergency Medicine, University of Utah, Salt Lake City, UT, USA.

University of Utah School of Medicine, Salt Lake City, UT, USA.

Publication information

J Ultrasound Med. 2022 Dec;41(12):3003-3012. doi: 10.1002/jum.16007. Epub 2022 May 12.

DOI:10.1002/jum.16007

PMID:35560254

Abstract

OBJECTIVES

To test if a deep learning (DL) model trained on echocardiography images could accurately segment the left ventricle (LV) and predict ejection fraction on apical 4-chamber images acquired by point-of-care ultrasound (POCUS).

METHODS

We created a dataset of 333 videos from cardiac POCUS exams acquired in the emergency department. For each video we derived two ground-truth labels: first, we segmented the LV in one image frame; second, we classified the EF as normal, reduced, or severely reduced. We then graded each video's quality as optimal, adequate, or inadequate. With this dataset we tested the accuracy of automated LV segmentation and EF classification by EchoNet-Dynamic, the best-in-class DL model trained on echocardiography.
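As an illustration of how a predicted LV segmentation can be scored against a ground-truth mask, the following minimal sketch computes the Dice similarity coefficient for one frame. It is not the authors' code: the function name `dice_coefficient`, the toy 4x4 masks, and the convention of returning 1.0 for two empty masks are all assumptions made for this example. Plain Python lists are used to keep it dependency-free; array libraries would be the usual choice for real images.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape.

    Masks are given as nested lists (rows of 0/1 or False/True values).
    """
    # Flatten both masks into lists of booleans.
    pred = [bool(p) for row in pred_mask for p in row]
    true = [bool(t) for row in true_mask for t in row]
    if len(pred) != len(true):
        raise ValueError("masks must contain the same number of pixels")

    overlap = sum(1 for p, t in zip(pred, true) if p and t)
    total = sum(pred) + sum(true)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * overlap / total


# Hypothetical toy masks: ground truth covers 4 pixels, prediction covers 6,
# and all 4 ground-truth pixels fall inside the prediction.
true_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

dice = dice_coefficient(pred_mask, true_mask)
print(f"Dice = {dice:.2f}")  # 2 * 4 / (6 + 4) = 0.80
```

A dataset-level figure would then be the mean of this per-frame value over all videos.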

RESULTS

The mean Dice similarity coefficient for LV segmentation was 0.72 (N = 333; 95% CI 0.70-0.74). Cohen's kappa coefficient for agreement between predicted and ground-truth EF classification was 0.16 (N = 333). The area under the receiver-operating curve for the diagnosis of heart failure was 0.74 (N = 333). Model performance improved with video quality for the tasks of LV segmentation and diagnosis of heart failure, but was unchanged with EF classification. For all tasks the model was less accurate than the published benchmarks for EchoNet-Dynamic.
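To make the agreement statistic concrete, here is a minimal sketch of unweighted Cohen's kappa for a three-class EF label. The abstract does not state whether a weighted or unweighted kappa was used, so the unweighted form is an assumption, as are the function name `cohens_kappa` and the six toy labels, which are invented and are not study data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed fraction of agreement; p_e is the agreement expected
    by chance from each rater's marginal class frequencies.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical class: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels (not study data).
ground_truth = ["normal", "normal", "reduced", "reduced", "severe", "severe"]
predicted = ["normal", "reduced", "reduced", "reduced", "severe", "normal"]

kappa = cohens_kappa(ground_truth, predicted)
print(f"kappa = {kappa:.2f}")  # p_o = 4/6, p_e = 12/36, kappa = 0.50
```

Because kappa discounts chance agreement, a value near 0.16 indicates agreement only slightly better than chance, even when raw percent agreement looks moderate.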

CONCLUSIONS

Performance of a DL model trained on formal echocardiography worsened when challenged with images captured during resuscitations. DL models intended for assessing bedside ultrasound should be trained on datasets composed of POCUS images. Such datasets have yet to be made publicly available.
