Huang Zhe, Long Gary, Wessler Benjamin, Hughes Michael C
Dept. of Computer Science, Tufts University, Medford, MA, USA.
CVAI Solutions, Dorchester, MA.
Proc Mach Learn Res. 2021 Aug;149:614-647.
Semi-supervised image classification has shown substantial progress in learning from limited labeled data, but recent advances remain largely untested for clinical applications. Motivated by the urgent need to improve timely diagnosis of life-threatening heart conditions, especially aortic stenosis, we develop a benchmark dataset to assess semi-supervised approaches to two tasks relevant to cardiac ultrasound (echocardiogram) interpretation: view classification and disease severity classification. We find that a state-of-the-art method called MixMatch achieves promising gains in held-out accuracy on both tasks. By learning from a large volume of truly unlabeled images as well as a labeled set collected at great expense, it achieves better performance than is possible with the labeled set alone. We further pursue patient-level diagnosis prediction, which requires aggregating across hundreds of images of diverse view types, most of which are irrelevant, to make a coherent prediction. The best patient-level performance is achieved by new methods that prioritize diagnosis predictions from images that are predicted to be clinically relevant views and transfer knowledge from the view task to the diagnosis task. We hope our released dataset and evaluation framework inspire further improvements in multi-task semi-supervised learning for clinical applications.
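As an illustration of the patient-level aggregation step, the following is a minimal sketch that assumes each image already has (a) a predicted probability of showing a clinically relevant view and (b) a vector of predicted severity-class probabilities. The function name, the inputs, and the relevance-weighted average are illustrative assumptions, not the exact procedure used in the paper.

```python
def aggregate_patient_prediction(view_relevance, diagnosis_probs):
    """Combine per-image diagnosis probabilities into one patient-level vector.

    view_relevance: one float in [0, 1] per image, the (assumed) predicted
        probability that the image shows a clinically relevant view.
    diagnosis_probs: one list per image, giving predicted probabilities
        over severity classes.

    Hypothetical sketch: a relevance-weighted average, so images from
    irrelevant views contribute little to the final prediction.
    """
    if not diagnosis_probs or len(view_relevance) != len(diagnosis_probs):
        raise ValueError("need one relevance weight per image")

    weights = list(view_relevance)
    total = sum(weights)
    if total == 0:
        # No image looks relevant: fall back to a plain unweighted average.
        weights = [1.0] * len(diagnosis_probs)
        total = float(len(diagnosis_probs))

    n_classes = len(diagnosis_probs[0])
    aggregated = [0.0] * n_classes
    for weight, probs in zip(weights, diagnosis_probs):
        for k, p in enumerate(probs):
            aggregated[k] += weight * p
    return [value / total for value in aggregated]


# Usage: the second image is judged irrelevant, so it is ignored.
patient_probs = aggregate_patient_prediction(
    view_relevance=[1.0, 0.0],
    diagnosis_probs=[[0.2, 0.8], [0.9, 0.1]],
)
```

Weighting by predicted relevance is one simple way to keep the many irrelevant views from diluting the patient-level prediction; hard thresholding on relevance would be an alternative design under the same assumptions.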