Sensory-Motor Systems (SMS) Lab, Department of Health Sciences and Technology, ETH Zurich, Switzerland.
Sleep-Wake-Epilepsy-Center, Department of Neurology, Bern University Hospital (Inselspital), Switzerland.
Physiol Meas. 2022 Sep 5;43(9). doi: 10.1088/1361-6579/ac89cb.
Learning to classify cardiac abnormalities requires large and high-quality labeled datasets, which is a challenge in medical applications. Small datasets from various sources are often aggregated to meet this requirement, resulting in a final dataset prone to label noise due to inter- and intra-observer variability and different expertise. It is well known that label noise can affect the performance and generalizability of the trained models. In this work, we explore the impact of label noise and self-learning label correction on the classification of cardiac abnormalities on large heterogeneous datasets of electrocardiogram (ECG) signals. A state-of-the-art self-learning multi-class label correction method for image classification is adapted to learn a multi-label classifier for ECG signals. We evaluated our performance using 5-fold cross-validation on the publicly available PhysioNet/Computing in Cardiology (CinC) 2021 Challenge data, with full and reduced sets of leads. Due to the unknown label noise in the testing set, we tested our approach on the MNIST dataset. We investigated the performance under different levels of structured label noise for both datasets. Under high levels of noise, the cross-validation results of self-learning label correction show an improvement of approximately 3% in the challenge score for the PhysioNet/CinC 2021 Challenge dataset and an improvement in accuracy of 5% and reduction of the expected calibration error of 0.03 for the MNIST dataset. We demonstrate that self-learning label correction can be used to effectively deal with the presence of unknown label noise, also when using a reduced number of ECG leads.
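The abstract refers to two quantities that are easy to state in code: structured label noise and the expected calibration error (ECE). The sketch below is illustrative only and is not taken from the paper. It assumes that structured noise can be modeled by a class-transition matrix (row i gives the probability of each noisy label given clean label i) and that ECE is computed with equal-width confidence bins; the function names, the bin count, and the single-label setting are all assumptions made for this example.

```python
import numpy as np


def flip_labels(labels, transition, rng):
    """Draw a noisy label for each clean label.

    Assumption: transition[i][j] = P(noisy label = j | clean label = i),
    so every row must sum to 1. This is one common way to model
    structured (class-dependent) noise, not necessarily the paper's.
    """
    labels = np.asarray(labels)
    transition = np.asarray(transition, dtype=float)
    n_classes = transition.shape[0]
    noisy = np.empty_like(labels)
    for i, y in enumerate(labels):
        noisy[i] = rng.choice(n_classes, p=transition[y])
    return noisy


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins over [0, 1].

    confidences: predicted probability of the chosen class, per sample.
    correct:     1 if the prediction was right, 0 otherwise.
    Returns the bin-size-weighted mean of |accuracy - mean confidence|.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Map each confidence to a bin index; a confidence of 1.0 goes in the last bin.
    bins = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bins == b
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```

For example, `flip_labels(y, T, np.random.default_rng(0))` with a matrix `T` that moves part of the probability mass of one class onto a similar class produces class-dependent noise at a chosen level. Calling `expected_calibration_error` on a model's top-class confidences before and after label correction gives the kind of comparison the abstract reports for MNIST.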