Loh De Rong, Yeo Si Yong, Tan Ru San, Gao Fei, Koh Angela S
Duke-NUS Medical School, 8 College Road, Singapore 169857, Singapore.
Singapore University of Technology and Design, 8 Somapah Road, Singapore 487372, Singapore.
Eur Heart J Digit Health. 2021 Nov 4;3(1):49-55. doi: 10.1093/ehjdh/ztab096. eCollection 2022 Mar.
Physical activity is a widely practiced intervention to modify cardiac health, but its effect on older adults is likely heterogeneous. While machine learning (ML) models that combine various systemic signals may aid predictive modelling, the inability to rationalize predictions at the level of the individual patient is a major shortcoming of current ML methods.
We applied a novel methodology, SHapley Additive exPlanations (SHAP), to a dataset of older adults (n = 86; mean age 72 ± 4 years) whose physical activity levels were studied alongside changes in their left ventricular (LV) structure. SHAP was tested for its ability to provide an intelligible visualization of the magnitude of the impact of physical activity features on LV structure. As proof of concept, using repeated K-fold cross-validation on the training set (n = 68), we identified the Random Forest Regressor hyperparameters that achieved the lowest mean squared error. We then evaluated the trained model by reporting its mean absolute error and plotting the correlation on the test set (n = 18). In the collective force plot, individually numbered patients are shown on the horizontal axis, and the width of each band indicates the magnitude (i.e. effect) of the physical parameters (higher in red; lower in blue) on the prediction of LV structure.
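The additive idea behind SHAP can be illustrated with a minimal sketch, which is not the study's code. It computes exact Shapley values for one hypothetical patient against a single baseline row, using a made-up prediction function in place of the trained Random Forest Regressor. The feature meanings, coefficients and values are all assumptions chosen for illustration. The SHAP library itself averages over a background dataset and uses faster tree-specific algorithms.

```python
from itertools import combinations
from math import factorial


def predict(z):
    # Hypothetical stand-in for a trained model (NOT the study's model).
    # z = [daily steps (thousands), sedentary hours, exercise sessions/week]
    return 100.0 + 2.0 * z[0] - 1.5 * z[1] + 0.5 * z[0] * z[2]


def shapley_values(predict_fn, x, baseline):
    """Exact Shapley values for sample x relative to one baseline row."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values; the others
        # are held at their baseline values.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict_fn(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Illustrative values only; not taken from the study dataset.
patient = [8.0, 6.0, 3.0]
baseline = [5.0, 8.0, 2.0]

phi = shapley_values(predict, patient, baseline)
base = predict(baseline)
pred = predict(patient)

# Additivity: baseline prediction plus all contributions equals the
# prediction for this patient.
print(phi, base + sum(phi), pred)
```

Each entry of `phi` corresponds to one band of a force plot for a single patient: positive values push the prediction above the baseline and negative values push it below. A collective force plot stacks such explanations for many patients side by side.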
SHAP identified specific features of physical activity that predicted cardiac structure on a per-patient level. Our findings support a role for explainable ML in personalized cardiology strategies.