IEEE J Biomed Health Inform. 2022 Oct;26(10):4880-4891. doi: 10.1109/JBHI.2022.3192021. Epub 2022 Oct 4.
Dry weight (DW), defined as the lowest tolerated postdialysis weight following the ultrafiltration (UF) of excess fluid volume, is essential to any dialysis prescription for hemodialysis (HD) patients. However, there is no gold standard for DW assessment, and accurate assessment is made harder by individual variation and by dynamic changes arising from uncertainty in patients' conditions. As a result, the current empirical evaluation process is often crude, imprecise, experience-dependent, and labor-intensive. Here, we focus on personalized dynamic changes in DW over time rather than on more accurate DW assessment at a single point in time, and we formulate DW evaluation as a sequential decision-making problem using the Markov decision process (MDP) framework. A reinforcement learning (RL) algorithm based on a dueling double deep Q-network (Duel-DDQN) is proposed to optimize the DW assessment policy, and a multifaceted inspection is applied to assess policy effectiveness and safety. We use ten years of data from the Kidney Disease Center, covering 750 HD patients and 243,287 dialysis sessions. Good model calibration is confirmed, and off-policy evaluation indicates that our policy outperforms other policies, suggesting a decrease of 7.71% in the expected 5-year mortality rate and of 13.44% in the incidence of intradialytic symptoms compared with the clinicians' strategy. The RL policy adjusts DW more frequently, responds to DW changes more actively, and takes a larger feature space into account. We hope that the proposed solution will help clinicians assess and monitor DW dynamically, making the estimation process more refined, personalized, and intelligent.
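For readers unfamiliar with the Duel-DDQN named in the abstract, the following is a minimal sketch of the two generic ingredients of that family of methods: the dueling aggregation of a state value and per-action advantages into Q-values, and the double-DQN target, in which the online network selects the next action and the target network evaluates it. It is not the authors' implementation. The function names, the three-action layout (decrease, keep, increase DW), and all numbers are illustrative assumptions, since the abstract does not specify the state, action, or reward design.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    next_q_online: List[float],
    next_q_target: List[float],
    done: bool,
) -> float:
    """Double-DQN target: the online network picks the next action,
    the target network supplies that action's value."""
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical action layout (an assumption, not taken from the paper):
# index 0 = decrease DW, 1 = keep DW, 2 = increase DW.
q_values = dueling_q(value=1.0, advantages=[0.0, 1.0, 2.0])
target = double_dqn_target(
    reward=1.0,
    gamma=0.5,
    next_q_online=[0.1, 0.9, 0.3],
    next_q_target=[2.0, 4.0, 8.0],
    done=False,
)
print(q_values, target)
```

In this toy call the online estimates favor action 1, so the target uses the target network's value for action 1 rather than its own maximum (action 2). Decoupling selection from evaluation in this way is what double Q-learning relies on to reduce overestimation of action values.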