Drudi Cristian, Mollura Maximiliano, Lehman Li-Wei H, Barbieri Riccardo
Department of Electronics, Informatics and Engineering, Politecnico di Milano, 20133 Milano, Italy.
Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
IEEE Open J Eng Med Biol. 2024 Feb 19;5:806-815. doi: 10.1109/OJEMB.2024.3367236. eCollection 2024.
The purpose of this study is to evaluate the importance of cardiorespiratory variables within a Reinforcement Learning (RL) recommendation system aimed at establishing optimal strategies for drug treatment of septic patients in the intensive care unit (ICU). We developed an RL model that establishes drug administration strategies for septic patients using only a set of cardiorespiratory variables. We then compared this model with other RL models trained on different sets of features. We selected patients meeting the Sepsis-3 criteria from the Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC III) database, resulting in a total of 20,496 ICU admissions. A Markov Decision Process (MDP) was built on the extracted discrete time series. A policy iteration algorithm was used to obtain the optimal AI policy for the MDP. The policy performance was then evaluated using the weighted importance sampling (WIS) estimator. The process was repeated for each set of variables, and the resulting policies were compared to a set of baseline benchmark policies. The model trained with cardiorespiratory variables outperformed all other models considered, achieving a 95% confidence lower bound score of 97.48. This finding highlights the importance of cardiovascular variables in the clinical RL recommendation system. We established an efficient RL model for sepsis treatment in the ICU and demonstrated that cardiorespiratory variables provide critical information for devising optimal policies. Given the potentially continuous availability of cardiorespiratory features extracted from bedside physiological waveform monitoring, the proposed framework paves the way for a real-time recommendation system for sepsis treatment.
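The two computational steps named in the abstract, policy iteration on a tabular MDP and off-policy evaluation with WIS, can be illustrated with a minimal sketch. This is not the authors' implementation: the array layout (`P` as states × actions × states, `R` as states × actions), the discount factor, the function names, and the trajectory format are all assumptions chosen for illustration, and the bootstrap step that would yield a 95% confidence lower bound is omitted.

```python
import numpy as np


def policy_iteration(P, R, gamma=0.99, max_iter=1000):
    """Tabular policy iteration (illustrative layout, not the paper's code).

    P: (S, A, S) array of transition probabilities.
    R: (S, A) array of expected immediate rewards.
    Returns a deterministic policy (S,) and its state values (S,).
    """
    n_states = P.shape[0]
    idx = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[idx, policy]            # (S, S)
        R_pi = R[idx, policy]            # (S,)
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to Q.
        Q = R + gamma * (P @ V)          # (S, A)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V


def wis_estimate(trajectories, pi_e, pi_b, gamma=0.99):
    """Weighted importance sampling estimate of the evaluation policy's value.

    trajectories: list of episodes, each a list of (state, action, reward).
    pi_e, pi_b: (S, A) action probabilities of the evaluation and
                behaviour (clinician) policies; pi_b must be positive
                for every observed (state, action) pair.
    """
    weights, returns = [], []
    for episode in trajectories:
        w, g = 1.0, 0.0
        for t, (s, a, r) in enumerate(episode):
            w *= pi_e[s, a] / pi_b[s, a]   # cumulative importance ratio
            g += (gamma ** t) * r          # discounted return
        weights.append(w)
        returns.append(g)
    weights = np.asarray(weights)
    returns = np.asarray(returns)
    total = weights.sum()
    # Normalising by the sum of weights is what distinguishes WIS
    # from ordinary importance sampling.
    return float(weights @ returns / total) if total > 0 else float("nan")
```

In practice the learned policy is evaluated on held-out trajectories, and WIS is typically applied to a softened (non-deterministic) version of the policy so that the importance weights do not all collapse to zero; whether and how that was done here is not stated in the abstract.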