Department of Cardiovascular Medicine, Tokushima University Hospital, Tokushima 770-8503, Japan.
Department of Medical Image Informatics, Graduate School of Biomedical Sciences, Tokushima University, Tokushima 770-8503, Japan.
Biomolecules. 2020 Apr 25;10(5):665. doi: 10.3390/biom10050665.
A proper echocardiographic study requires several video clips recorded from different acquisition angles to capture the complex cardiac anatomy. However, these clips are not necessarily labeled in a database, so identifying the acquired view becomes the first step in analyzing an echocardiogram. Currently, there is no consensus on whether mislabeled samples can be used to create a feasible clinical prediction model of ejection fraction (EF). The aim of this study was to test two types of input methods for image classification and to test the accuracy of an EF prediction model trained on a database containing mislabeled images that were not checked by observers. We enrolled 340 patients with five standard views (long axis, short axis, 3-chamber, 4-chamber, and 2-chamber) and 10 images per cardiac cycle, used to train a convolutional neural network (CNN) to classify views (17,000 labeled images in total). All DICOM images were rigidly registered and rescaled to a reference image to standardize the size of the echocardiographic images. We employed 5-fold cross-validation to examine model performance and tested models trained on two types of data: averaged images and 10 selected images. Our best model (trained on the 10 selected images) classified video views with 98.1% overall test accuracy in an independent cohort; that is, 1.9% of the images were misclassified by the view classification model. To determine whether this 98.1% accuracy is acceptable for building a clinical prediction model from echocardiographic data, we tested an EF prediction model trained on data with a 1.9% label error rate. The accuracy of the EF prediction model remained acceptable even with training data containing 1.9% mislabeled images. The CNN algorithm can classify images into the five standard views in a clinical setting, and our results suggest that this approach may provide a clinically feasible level of view-classification accuracy for the analysis of echocardiographic data.
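The two input types compared in the abstract, a per-cycle averaged image versus a stack of 10 selected frames, can be illustrated with a short preprocessing sketch. This is a minimal illustration in Python, assuming pydicom and OpenCV for I/O and rescaling, a hypothetical 224x224 reference size, and even frame sampling; the paper's actual registration and rescaling parameters are not reproduced here.

```python
# Minimal sketch of the two input constructions described in the abstract:
# (a) an image averaged over the cardiac cycle, and (b) a stack of 10 frames.
# pydicom, OpenCV, the 224x224 reference size, and even frame sampling are
# assumptions for illustration, not the authors' stated pipeline.
import numpy as np
import pydicom
import cv2

REF_SIZE = (224, 224)  # hypothetical reference image size
N_FRAMES = 10          # 10 images per cardiac cycle, as in the study


def load_cycle(dicom_path: str) -> np.ndarray:
    """Read a multi-frame echo DICOM clip and rescale each frame to REF_SIZE."""
    ds = pydicom.dcmread(dicom_path)
    frames = ds.pixel_array          # (n_frames, H, W) for a multi-frame clip
    if frames.ndim == 2:             # single-frame file: add a frame axis
        frames = frames[np.newaxis]
    resized = np.stack([cv2.resize(f.astype(np.float32), REF_SIZE)
                        for f in frames])
    return resized / resized.max()   # normalize intensities to [0, 1]


def averaged_input(frames: np.ndarray) -> np.ndarray:
    """Input type 1: a single image averaged over the cycle."""
    return frames.mean(axis=0, keepdims=True)   # shape (1, H, W)


def selected_input(frames: np.ndarray, n: int = N_FRAMES) -> np.ndarray:
    """Input type 2: n frames sampled evenly across the cycle."""
    idx = np.linspace(0, len(frames) - 1, n).astype(int)
    return frames[idx]                           # shape (n, H, W)
```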
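The 5-fold cross-validation can likewise be sketched. Since each patient contributes multiple images (5 views, 10 frames each), folds should be split at the patient level so no patient's images leak between training and validation; scikit-learn's GroupKFold is one way to do this. The small Keras CNN below is a placeholder for illustration, not the architecture used in the paper.

```python
# Hedged sketch of patient-grouped 5-fold cross-validation for the five-view
# classifier. GroupKFold and the toy Keras CNN are assumptions; the paper's
# model architecture and training settings are not specified here.
import numpy as np
from sklearn.model_selection import GroupKFold
from tensorflow import keras

VIEWS = ["long_axis", "short_axis", "3ch", "4ch", "2ch"]


def build_model(input_shape=(224, 224, 1)) -> keras.Model:
    """Placeholder CNN producing a softmax over the five standard views."""
    model = keras.Sequential([
        keras.layers.Input(shape=input_shape),
        keras.layers.Conv2D(16, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Conv2D(32, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(len(VIEWS), activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def cross_validate(x: np.ndarray, y: np.ndarray,
                   patient_ids: np.ndarray) -> float:
    """5-fold CV with folds split by patient, so no patient's images appear
    in both the training and validation sets of the same fold."""
    scores = []
    for train_idx, val_idx in GroupKFold(n_splits=5).split(x, y, patient_ids):
        model = build_model(x.shape[1:])
        model.fit(x[train_idx], y[train_idx],
                  epochs=5, batch_size=32, verbose=0)
        _, acc = model.evaluate(x[val_idx], y[val_idx], verbose=0)
        scores.append(acc)
    return float(np.mean(scores))    # mean validation accuracy across folds
```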