Division of Oncology, Medical University of Graz, Auenbruggerplatz 15, 8036, Graz, Austria.
Institute of Psychology, University of Graz, Graz, Austria.
Sci Rep. 2021 Nov 16;11(1):22292. doi: 10.1038/s41598-021-01779-1.
Most cancer patients exhibit autonomic dysfunction, with attenuated heart rate variability (HRV) compared to healthy controls. This research aimed to create and evaluate a machine learning (ML) model that discriminates between cancer patients and healthy controls based on 5-min ECG recordings. We selected 12 HRV features based on previous research and compared them between cancer patients and healthy individuals using the Wilcoxon rank-sum test. Recursive feature elimination (RFE) identified the top five features, which were averaged over 5 min and used as input to three different ML classifiers. Next, we created an ensemble model based on a stacking method that aggregated the predictions from all three base classifiers. All HRV features differed significantly between the two groups. RFE selected SDNN, RMSSD, pNN50%, HRV triangular index, and SD1. All three base classifiers performed above chance level, with RF performing best at a testing accuracy of 83%. The ensemble model showed a classification accuracy of 86% and an AUC of 0.95. These results suggest that HRV parameters could be a reliable input for differentiating between cancer patients and healthy controls. The results should be interpreted in light of some limitations that call for replication studies with larger sample sizes.
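For readers unfamiliar with the selected features, the following is a minimal sketch of how four of them (SDNN, RMSSD, pNN50, SD1) are commonly computed from a series of RR intervals. It is not the authors' code: the function name `hrv_time_domain`, the millisecond input, and the use of the sample standard deviation are assumptions made for illustration, and the HRV triangular index is omitted because it depends on a histogram bin width the abstract does not state.

```python
import math
import statistics


def hrv_time_domain(rr_ms):
    """Return SDNN, RMSSD, pNN50 and SD1 for RR intervals given in ms.

    Hypothetical helper; definitions follow common HRV conventions,
    not necessarily the exact ones used in the study.
    """
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")

    # Successive differences between adjacent RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]

    # SDNN: standard deviation of all RR intervals (sample SD assumed).
    sdnn = statistics.stdev(rr_ms)

    # RMSSD: root mean square of successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # pNN50: percentage of successive differences larger than 50 ms.
    pnn50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)

    # SD1: Poincare plot short-axis dispersion, sqrt(1/2) times the
    # standard deviation of successive differences.
    sd1 = math.sqrt(0.5) * statistics.stdev(diffs)

    return {"SDNN": sdnn, "RMSSD": rmssd, "pNN50": pnn50, "SD1": sd1}


features = hrv_time_domain([800, 860, 790, 810, 900])
```

In a pipeline like the one described, a feature vector of this kind per recording would be what the feature selection and classification steps consume; the five RR values above are made-up illustration data, not study data.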