Agius Rudi, Riis-Jensen Anders C, Wimmer Bettina, da Cunha-Bang Caspar, Murray Daniel Dawson, Poulsen Christian Bjorn, Bertelsen Marianne B, Schwartz Berit, Lundgren Jens Dilling, Langberg Henning, Niemann Carsten Utoft
Department of Hematology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark.
SP Sundhedsdata, The Data Unit, Capital Region of Denmark, Copenhagen, Denmark.
NPJ Digit Med. 2024 Jun 5;7(1):147. doi: 10.1038/s41746-024-01132-6.
Research algorithms are seldom externally validated or integrated into clinical practice, so the challenges of deployment remain largely unknown. Such efforts must address data harmonization, algorithm performance under unforeseen missingness, automation and monitoring of predictions, and legal frameworks. Here we describe the deployment of a high-dimensional, data-driven decision support model into an electronic health record (EHR) and derive practical guidelines from this deployment, covering the processes, stakeholders and design requirements needed for success. Specifically, we deployed the chronic lymphocytic leukemia (CLL) treatment infection model (CLL-TIM) as a stand-alone platform adjoined to an EPIC-based Danish EHR, with personalized predictions presented in a clinical context. CLL-TIM is an 84-variable data-driven prognostic model that uses 7 years of patient medical records to predict the 2-year risk of a composite outcome of infection and/or treatment after CLL diagnosis. As an independent validation cohort for this deployment, we used a retrospective population-based cohort of patients diagnosed with CLL from 2018 onwards (n = 1480). Upon deployment, key CLL-TIM variables showed unexpectedly high levels of missingness. High dimensionality, together with the handling of missingness, and predictive confidence were critical design elements that enabled trustworthy predictions and thus serve as priorities for prognostic models seeking deployment in new EHRs. Our deployment setup, including automation and monitoring within an EHR in compliance with Medical Device Regulations, may serve as a step-by-step guideline for others aiming to design and deploy research algorithms in clinical practice.