Institute of Medical Informatics, University of Münster, Münster, Germany.
Stud Health Technol Inform. 2023 May 18;302:182-186. doi: 10.3233/SHTI230099.
Deep Learning architectures for time series require a large number of training samples; however, traditional sample size estimation for sufficient model performance is not applicable to machine learning, especially in the field of electrocardiograms (ECGs). This paper outlines a sample size estimation strategy for binary classification problems on ECGs using different deep learning architectures and the large publicly available PTB-XL dataset, which includes 21,801 ECG samples. This work evaluates binary classification tasks for Myocardial Infarction (MI), Conduction Disturbance (CD), ST/T Change (STTC), and Sex. All estimations are benchmarked across different architectures, including XResNet, InceptionTime, XceptionTime, and a fully convolutional network (FCN). The results indicate trends in the sample sizes required for given tasks and architectures, which can serve as guidance for future ECG studies and feasibility assessments.
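The general idea behind such empirical sample size estimation is to train the same model on nested subsets of increasing size and trace the resulting performance curve. The sketch below is only a minimal illustration of that learning-curve approach, not the paper's actual pipeline: it uses synthetic feature vectors in place of PTB-XL ECGs and a nearest-centroid classifier as a cheap stand-in for the deep architectures.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, dim=16, shift=0.6):
    # Synthetic stand-in for ECG feature vectors: two Gaussian classes
    # whose means differ by `shift` in every dimension.
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, dim)) + shift * y[:, None]
    return X, y

def accuracy_at(n_train, n_test=2000):
    # Train on n_train samples, evaluate on a fixed-size held-out set.
    X, y = make_data(n_train)
    Xt, yt = make_data(n_test)
    # Nearest-centroid classifier: a cheap stand-in for a deep model.
    c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Xt - c1, axis=1)
            < np.linalg.norm(Xt - c0, axis=1)).astype(int)
    return float((pred == yt).mean())

# Nested subset sizes; the trend of the curve indicates how much
# additional data still improves performance for this task.
sizes = [50, 200, 800, 3200]
curve = [(n, accuracy_at(n)) for n in sizes]
for n, acc in curve:
    print(n, round(acc, 3))
```

In practice one would repeat each subset size with several random draws, fit a saturating curve (e.g. a power law) to the mean performance, and read off the sample size at which the target metric plateaus.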