Hodgman Matthew, Minoccheri Cristian, Mathis Michael, Wittrup Emily, Najarian Kayvan
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
Department of Anesthesiology, University of Michigan, Ann Arbor, MI 48109, USA.
Diagnostics (Basel). 2024 Aug 10;14(16):1741. doi: 10.3390/diagnostics14161741.
Acute myocardial infarctions are deadly to patients and burdensome to healthcare systems. Most recorded infarctions are a patient's first, occur outside the hospital, and are often not accompanied by cardiac comorbidities. The clinical manifestations of the pathophysiology leading to an infarction are not fully understood, and little work has applied explainable machine learning to learn predictive clinical phenotypes before hospitalization is needed.
We extracted outpatient electronic health record data for 2641 case and 5287 matched-control patients, all without pre-existing cardiac diagnoses, from the Michigan Medicine Health System. We compared six interpretable feature extraction approaches, including temporal computational phenotyping, and trained seven interpretable machine learning models to predict the onset of a first acute myocardial infarction within six months.
Using temporal computational phenotypes significantly improved model performance compared with the alternative approaches. The mean cross-validation test-set area under the receiver operating characteristic curve (AUROC) reached values as high as 0.674. The phenotypes most consistently predictive of a future infarction include back pain, cardiometabolic syndrome, family history of cardiovascular diseases, and high blood pressure.
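For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a mean cross-validation AUROC can be computed from held-out predictions. It is not the authors' pipeline: the function names `auroc` and `mean_cv_auroc` and the toy fold data are hypothetical, and the pairwise (Mann-Whitney) formulation is just one standard way to compute the statistic.

```python
from statistics import mean


def auroc(labels, scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_cv_auroc(folds):
    """Average the AUROC over held-out folds given as (labels, scores) pairs."""
    return mean(auroc(labels, scores) for labels, scores in folds)


# Hypothetical held-out predictions for two folds (1 = case, 0 = control).
toy_folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]),
    ([1, 0, 0, 1], [0.7, 0.7, 0.1, 0.8]),
]
fold_scores = [auroc(y, s) for y, s in toy_folds]
cv_mean = mean_cv_auroc(toy_folds)
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which gives context for the reported value of 0.674. The pairwise loop is quadratic in the fold size, so a rank-based implementation would be preferable for cohorts of several thousand patients.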
Computational phenotyping of longitudinal health records can improve classifier performance and identify predictive clinical concepts. State-of-the-art interpretable machine learning approaches can augment acute myocardial infarction risk assessment and prioritize potential risk factors for further investigation and validation.