Machine Learning Research Group, Aalen University, Aalen, Germany.
PLoS One. 2020 Dec 17;15(12):e0243615. doi: 10.1371/journal.pone.0243615. eCollection 2020.
We present the results of a white-box machine learning approach for detecting cardiac arrhythmias from electrocardiographic data. A C5.0 decision tree is trained to recognize four classes using common features. The four classes are (i) atrial fibrillation and atrial flutter, (ii) tachycardias, (iii) sinus bradycardia and (iv) sinus rhythm. Data from 10,646 subjects are used, 83% of whom have at least one arrhythmia and 17% of whom exhibit a normal sinus rhythm. The C5.0 is trained using 10-fold cross-validation and achieves a balanced accuracy of 95.35%. The white-box approach reveals a clear and comprehensible tree structure, which selects the 5 most important features from a total of 24. These 5 features are ventricular rate, RR interval variation, atrial rate, age and the difference between the longest and shortest RR interval. The tree shows that the combination of ventricular rate, RR interval variation and atrial rate is especially relevant for classification accuracy. The tree assigns unique values to distinguish the classes. These findings could be applied in medicine in the future. The results show that a white-box machine learning approach can reveal granular structures, thus confirming known linear relationships and also revealing nonlinear ones. To highlight the strength of the C5.0 with respect to this structural revelation, the results of further white-box and black-box machine learning algorithms are also presented.
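The abstract reports balanced accuracy, not plain accuracy, which matters when class sizes differ as much as they do here. As a minimal sketch, assuming the usual definition of balanced accuracy as the unweighted mean of per-class recall, the metric can be computed as follows. The function name, the class labels and the toy predictions are invented for illustration and are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the unweighted mean of per-class recall.

    Assumes the common definition of balanced accuracy; classes are
    taken from y_true, so every class has at least one true sample.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels for the four classes (not data from the paper).
y_true = (["afib_flutter"] * 4 + ["tachycardia"] * 2
          + ["sinus_bradycardia"] * 2 + ["sinus_rhythm"] * 2)
y_pred = (["afib_flutter"] * 4
          + ["tachycardia", "afib_flutter"]
          + ["sinus_bradycardia"] * 2
          + ["sinus_rhythm", "sinus_bradycardia"])

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # fraction of all samples predicted correctly
print(balanced_accuracy(y_true, y_pred))  # mean of the four per-class recalls
```

In this toy case the largest class is predicted perfectly, so plain accuracy is higher than balanced accuracy. Averaging recall per class keeps a dominant class from masking errors on the smaller ones.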