Holste Gregory, Oikonomou Evangelos K, Mortazavi Bobak J, Wang Zhangyang, Khera Rohan
Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA.
Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA.
Commun Med (Lond). 2024 Jul 6;4(1):133. doi: 10.1038/s43856-024-00538-3.
Advances in self-supervised learning (SSL) have enabled state-of-the-art automated medical image diagnosis from small, labeled datasets. This label efficiency is often desirable, given the difficulty of obtaining expert labels for medical image recognition tasks. However, most efforts toward SSL in medical imaging are not adapted to video-based modalities, such as echocardiography.
We developed a self-supervised contrastive learning approach, EchoCLR, for echocardiogram videos with the goal of learning strong representations for efficient fine-tuning on downstream cardiac disease diagnosis. EchoCLR pretraining involves (i) contrastive learning, where the model is trained to identify distinct videos of the same patient, and (ii) frame reordering, where the model is trained to predict the correct order of video frames after they have been randomly shuffled.
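As a rough illustration of these two pretraining objectives, the sketch below shows one way they could be combined in PyTorch. The class name EchoCLRPretrainer, the projection and ordering heads, the fixed set of candidate frame permutations, and all hyperparameters (embed_dim, num_frames, temperature) are illustrative assumptions rather than the authors' implementation; the encoder is assumed to be any video backbone mapping a clip of shape (batch, channels, frames, height, width) to an embedding vector.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EchoCLRPretrainer(nn.Module):
    """Hypothetical wrapper combining the two pretraining objectives:
    patient-level contrastive learning and frame-order prediction."""

    def __init__(self, video_encoder, embed_dim=512, num_frames=16,
                 num_permutations=24, temperature=0.1):
        super().__init__()
        self.encoder = video_encoder               # maps (B, C, T, H, W) -> (B, embed_dim)
        self.proj = nn.Linear(embed_dim, 128)      # projection head for the contrastive loss
        # Fixed candidate permutations of the T frames; the reordering task is
        # posed here as classification over this set (an assumption).
        perms = torch.stack([torch.randperm(num_frames) for _ in range(num_permutations)])
        self.register_buffer("permutations", perms)
        self.order_head = nn.Linear(embed_dim, num_permutations)
        self.temperature = temperature

    def contrastive_loss(self, clip_a, clip_b):
        """InfoNCE-style loss: two different videos of the same patient form a
        positive pair; all other videos in the batch act as negatives."""
        z_a = F.normalize(self.proj(self.encoder(clip_a)), dim=1)
        z_b = F.normalize(self.proj(self.encoder(clip_b)), dim=1)
        logits = z_a @ z_b.t() / self.temperature            # (B, B) similarity matrix
        targets = torch.arange(z_a.size(0), device=z_a.device)
        return F.cross_entropy(logits, targets)

    def reordering_loss(self, clip):
        """Shuffle each clip's frames with a randomly chosen permutation and
        train the model to predict which permutation was applied."""
        b = clip.size(0)
        perm_idx = torch.randint(len(self.permutations), (b,), device=clip.device)
        shuffled = torch.stack(
            [clip[i, :, self.permutations[perm_idx[i]]] for i in range(b)]
        )                                                     # permute along the time axis
        logits = self.order_head(self.encoder(shuffled))
        return F.cross_entropy(logits, perm_idx)

    def forward(self, clip_a, clip_b):
        # clip_a and clip_b: two distinct echocardiogram videos of the same patient.
        return self.contrastive_loss(clip_a, clip_b) + self.reordering_loss(clip_a)
```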
When fine-tuned on small portions of labeled data, EchoCLR pretraining significantly improves classification performance for left ventricular hypertrophy (LVH) and aortic stenosis (AS) over other transfer learning and SSL approaches across internal and external test sets. When fine-tuning on 10% of available training data (519 studies), an EchoCLR-pretrained model achieves 0.72 AUROC (95% CI: [0.69, 0.75]) on LVH classification, compared to 0.61 AUROC (95% CI: [0.57, 0.64]) with a standard transfer learning approach. Similarly, using 1% of available training data (53 studies), EchoCLR pretraining achieves 0.82 AUROC (95% CI: [0.79, 0.84]) on severe AS classification, compared to 0.61 AUROC (95% CI: [0.58, 0.65]) with transfer learning.
EchoCLR is unique in its ability to learn representations of echocardiogram videos and demonstrates that SSL can enable label-efficient disease classification from small amounts of labeled data.