Escudero-Arnanz Óscar, Marques Antonio G, Mora-Jiménez Inmaculada, Álvarez-Rodríguez Joaquín, Soguero-Ruiz Cristina
Department of Signal Theory and Communications, King Juan Carlos University, Camino del Molino, 5, Fuenlabrada, 28942, Madrid, Spain.
University Hospital of Fuenlabrada, Camino del Molino, 2, Fuenlabrada, 28942, Madrid, Spain.
Comput Methods Programs Biomed. 2025 Oct;270:108920. doi: 10.1016/j.cmpb.2025.108920. Epub 2025 Jul 12.
Multidrug resistance (MDR) has been identified by the World Health Organization as a major global health threat. It leads to severe social and economic consequences, including extended hospital stays, increased healthcare costs, and higher mortality rates. In response to this challenge, this study proposes a novel interpretable Machine Learning (ML) approach for predicting MDR, developed with two primary objectives: accurate inference and enhanced explainability.
For inference, the proposed method uses patient-to-patient similarity representations to predict MDR outcomes. Each patient is modeled as a Multivariate Time Series (MTS), capturing both clinical progression and interactions with similar patients. To quantify these relationships, we employ MTS-based similarity metrics: feature engineering with descriptive statistics, Dynamic Time Warping (DTW), and the Time Cluster Kernel (TCK). The resulting similarity representations serve as inputs for MDR classification with Logistic Regression, Random Forest, and Support Vector Machines, with dimensionality reduction and kernel transformations applied to enhance model performance. For explainability, we employ graph-based methods to extract meaningful patterns from the data. Patient similarity networks are built from the same MTS-based similarity metrics, and spectral clustering and t-SNE are applied to identify MDR-related subgroups, uncover clinically relevant patterns, and visualize high-risk clusters. These insights improve interpretability and support more informed decision-making in critical care settings.
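As an illustration of the similarity step described above, the following minimal sketch computes a DTW distance between multivariate patient series, converts the distances into a similarity matrix, and derives a nearest-neighbor patient graph. It is not the authors' implementation: the function names (`dtw_distance`, `similarity_matrix`, `knn_edges`), the exponential transform with parameter `gamma`, the k-nearest-neighbor graph rule, and the toy data are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two multivariate series.

    Each series is a list of time steps; each time step is a list of
    feature values. Series may differ in length but must share the same
    number of features per time step.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two time steps.
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in series a only
                cost[i][j - 1],      # advance in series b only
                cost[i - 1][j - 1],  # advance in both series
            )
    return cost[n][m]


def similarity_matrix(patients, gamma=0.5):
    """Patient-to-patient similarity exp(-gamma * DTW) (assumed transform)."""
    n = len(patients)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = dtw_distance(patients[i], patients[j])
            sim[i][j] = sim[j][i] = math.exp(-gamma * d)
    return sim


def knn_edges(sim, k=1):
    """Undirected edges linking each patient to its k most similar peers."""
    edges = set()
    for i, row in enumerate(sim):
        others = [j for j in range(len(row)) if j != i]
        nearest = sorted(others, key=lambda j: row[j], reverse=True)[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Hypothetical toy data: three patients, two features per time step.
patients = [
    [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 1.0]],
    [[5.0, 0.0], [6.0, 0.0], [6.0, 0.0]],
]
K = similarity_matrix(patients)
edges = knn_edges(K, k=1)
```

In this toy example the first two patients follow the same trajectory at different speeds, so their DTW distance is zero and their similarity is 1. A matrix like `K` could in principle be passed to a classifier that accepts precomputed kernels, or to a spectral clustering routine that accepts a precomputed affinity. Note that a DTW-derived kernel of this form is not guaranteed to be positive semidefinite.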
We validate our architecture on real-world Electronic Health Records from the Intensive Care Unit (ICU) of the University Hospital of Fuenlabrada, achieving an area under the Receiver Operating Characteristic curve (ROC AUC) of 81%. By leveraging graph-based patient similarity, our framework outperforms ML and deep learning models evaluated on the same dataset. In addition, it offers a simple yet effective interpretability mechanism that facilitates the identification of key risk factors (such as prolonged antibiotic exposure, invasive procedures, co-infections, and extended ICU stays) and the discovery of clinically meaningful patient clusters. For transparency, all results and code are available at https://github.com/oscarescuderoarnanz/DM4MTS.
This study demonstrates the effectiveness of patient similarity representations and graph-based methods for MDR prediction and interpretability. The approach enhances prediction, identifies key risk factors, and improves patient stratification, which enables early detection and targeted interventions. These results highlight the potential of interpretable ML in critical care.