Akan Taymaz, Alp Sait, Bhuiyan Md Shenuarin, Helmy Tarek, Orr A Wayne, Rahman Bhuiyan Md Mostafizur, Conrad Steven A, Vanchiere John A, Kevil Christopher G, Bhuiyan Mohammad A N
Department of Medicine, Louisiana State University Health Sciences Center at Shreveport, Shreveport, LA 71103, USA.
Department of Computer Engineering, Erzurum Technical University, Erzurum, Turkey.
medRxiv. 2024 Jun 22:2024.06.21.24309327. doi: 10.1101/2024.06.21.24309327.
Heart disease is the leading cause of death worldwide, and cardiac function as measured by ejection fraction (EF) is an important determinant of outcomes, making its accurate measurement a critical parameter in patient evaluation. Echocardiograms are commonly used to measure EF, but human interpretation suffers from intra- and inter-observer (reader) variability. Deep learning (DL) has driven a resurgence in machine learning, leading to advances in medical applications. We introduce ViViEchoformer, a DL approach that uses a video vision transformer to directly regress the left ventricular ejection fraction (LVEF) from echocardiogram videos. The study used a dataset of 10,030 apical-4-chamber echocardiography videos from patients at Stanford University Hospital. By extracting spatiotemporal tokens from the video input, the model accurately captures spatial information and preserves inter-frame relationships, allowing fully automatic EF predictions that aid human assessment and analysis. ViViEchoformer's ejection fraction predictions have a mean absolute error of 6.14%, a root mean squared error of 8.4%, a mean squared log error of 0.04, and a correlation coefficient of 0.55. ViViEchoformer predicted heart failure with reduced ejection fraction (HFrEF) with an area under the curve of 0.83 and a classification accuracy of 87% using the standard threshold of less than 50% ejection fraction. Our video-based method provides precise quantification of left ventricular function, offering a reliable alternative to human evaluation and establishing a fundamental basis for echocardiogram interpretation.
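The reported error metrics and the HFrEF screening threshold can be made concrete with a short sketch. The functions below compute mean absolute error, root mean squared error, and mean squared log error for EF regression, and apply the standard EF < 50% cutoff mentioned in the abstract; the EF values used here are illustrative toy numbers, not data from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, and MSLE for EF predictions (values in percent)."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    # Mean squared log error; log1p is safe since EF percentages are positive.
    msle = sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / n
    return mae, rmse, msle

def hfref_flag(ef_percent, threshold=50.0):
    """Flag heart failure with reduced EF using the standard < 50% cutoff."""
    return ef_percent < threshold

# Hypothetical ground-truth and predicted EF values (%), for illustration only.
y_true = [62.0, 45.0, 30.0, 55.0]
y_pred = [58.0, 49.0, 35.0, 60.0]
mae, rmse, msle = regression_metrics(y_true, y_pred)
print(mae, rmse, msle)                      # per-metric toy results
print([hfref_flag(ef) for ef in y_pred])    # HFrEF flags per prediction
```

On this toy batch the MAE is 4.5 percentage points and two of the four predictions fall below the HFrEF threshold; the abstract's figures (MAE 6.14%, RMSE 8.4%, MSLE 0.04) come from the full 10,030-video dataset.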