Bologheanu Razvan, Kapral Lorenz, Laxar Daniel, Maleczek Mathias, Dibiasi Christoph, Zeiner Sebastian, Agibetov Asan, Ercole Ari, Thoral Patrick, Elbers Paul, Heitzinger Clemens, Kimberger Oliver
Department of Anaesthesia, Intensive Care Medicine and Pain Medicine, Medical University of Vienna, 1090 Vienna, Austria.
Ludwig Boltzmann Institute for Digital Health and Patient Safety, 1090 Vienna, Austria.
J Clin Med. 2023 Feb 14;12(4):1513. doi: 10.3390/jcm12041513.
The optimal indication, dose, and timing of corticosteroids in sepsis are controversial. Here, we used reinforcement learning (RL) to derive the optimal steroid policy in septic patients based on data from 3051 ICU admissions in the AmsterdamUMCdb intensive care database.
We identified septic patients according to the 2016 consensus definition. An actor-critic RL algorithm using ICU mortality as a reward signal was developed to determine the optimal treatment policy from time-series data on 277 clinical parameters. We performed off-policy evaluation and testing in independent subsets to assess the algorithm's performance.
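As a rough illustration of what off-policy evaluation with a lower confidence bound can look like, the sketch below uses weighted importance sampling (WIS) and a percentile bootstrap. This is an assumption for illustration only: the abstract does not state which estimator the authors used, and the function names, the +1/-1 reward encoding, and the toy data layout are hypothetical.

```python
import random


def trajectory_weight(steps):
    """Importance weight of one ICU stay.

    `steps` is a list of (p_agent, p_clinician) pairs: the probability each
    policy assigns to the action that was actually documented at that time
    step. p_clinician must be > 0 because the action was observed.
    """
    weight = 1.0
    for p_agent, p_clinician in steps:
        weight *= p_agent / p_clinician
    return weight


def wis_value(trajectories):
    """Weighted importance sampling estimate of the agent's expected reward.

    Each trajectory is (steps, reward), where reward is the terminal outcome
    (assumed here: +1 for ICU survival, -1 for ICU death).
    """
    weights = [trajectory_weight(steps) for steps, _ in trajectories]
    total = sum(weights)
    if total == 0.0:
        # The agent assigns zero probability to every logged stay,
        # so the data carry no information about its policy.
        return 0.0
    weighted_sum = sum(w * reward for w, (_, reward) in zip(weights, trajectories))
    return weighted_sum / total


def bootstrap_lower_bound(trajectories, n_boot=1000, alpha=0.05, seed=0):
    """One-sided (1 - alpha) lower bound on the WIS value (percentile bootstrap)."""
    rng = random.Random(seed)
    n = len(trajectories)
    estimates = sorted(
        wis_value([trajectories[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    return estimates[int(alpha * n_boot)]
```

With `alpha=0.05` the function returns the 5th percentile of the bootstrap estimates, which is one plausible reading of a "95% lower bound"; the same procedure applied to the clinicians' observed rewards would give the comparison value.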
Agreement between the RL agent's policy and the documented treatment reached 59%. The RL agent's treatment policy was more restrictive than actual clinician behavior: the algorithm suggested withholding corticosteroids in 62% of patient states, versus 52% under the physicians' policy. The 95% lower bound of the expected reward was higher for the RL agent than for the clinicians' historical decisions. In the testing dataset, ICU mortality after concordant action was lower both when corticosteroids had been withheld and when corticosteroids had been prescribed by the virtual agent. The most relevant variables were vital parameters and laboratory values, such as blood pressure, heart rate, leucocyte count, and glycemia.
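The agreement and withholding percentages reported above are simple proportions over patient states. A minimal sketch of how such figures could be computed is given below; the binary action encoding (0 = withhold, 1 = give corticosteroids), the function name, and the toy data are assumptions for illustration, not the study's actual action space or data.

```python
def policy_summary(agent_actions, clinician_actions, withhold=0):
    """Summarize how an agent's recommendations compare with documented care.

    Both arguments are equal-length sequences with one action per patient
    state. `withhold` is the (assumed) code for "no corticosteroids".
    """
    if len(agent_actions) != len(clinician_actions):
        raise ValueError("action sequences must have the same length")
    n = len(agent_actions)
    if n == 0:
        raise ValueError("at least one patient state is required")

    agree = sum(a == c for a, c in zip(agent_actions, clinician_actions))
    return {
        "agreement": agree / n,
        "agent_withhold": sum(a == withhold for a in agent_actions) / n,
        "clinician_withhold": sum(c == withhold for c in clinician_actions) / n,
    }


# Toy example with five patient states (made-up values).
agent = [0, 0, 1, 0, 1]
clinician = [0, 1, 1, 1, 0]
summary = policy_summary(agent, clinician)
```

In this toy example the agent withholds in 3 of 5 states and the clinicians in 2 of 5, with the two policies agreeing in 2 of 5 states.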
Individualized use of corticosteroids in sepsis may result in a mortality benefit, but the optimal treatment policy may be more restrictive than routine clinical practice. Whilst external validation is needed, our study motivates a 'precision-medicine' approach to future prospective controlled trials and practice.