Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, USA.
Molecular Engineering and Sciences Institute, University of Washington, Seattle, Washington, USA.
J Am Med Inform Assoc. 2020 Jul 1;27(9):1393-1400. doi: 10.1093/jamia/ocaa083.
The development of predictive models for clinical application requires the availability of electronic health record (EHR) data, which is complicated by patient privacy concerns. We showcase the "Model to Data" (MTD) approach as a new mechanism to make private clinical data available for the development of predictive models. Under this framework, we eliminate researchers' direct interaction with patient data by delivering containerized models to the EHR data.
We operationalize the MTD framework using the Synapse collaboration platform and an on-premises secure computing environment at the University of Washington that hosts the EHR data. Containerized mortality prediction models developed by a model developer were delivered to the University of Washington via Synapse, where the models were trained and evaluated. Model performance metrics were returned to the model developer.
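The flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use Synapse or any container runtime: `SubmittedModel`, `run_on_premises`, the toy threshold model, and the record layout are all hypothetical names chosen for illustration. The point it tries to show is only the boundary: the submitted model sees the data inside the host environment, and nothing but aggregate metrics leaves it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical record layout: (feature vector, died flag). Not the real EHR schema.
Row = Tuple[Sequence[float], int]


@dataclass
class SubmittedModel:
    """Stands in for a containerized model: it exposes only train and predict."""
    train: Callable[[List[Row]], object]
    predict: Callable[[object, Sequence[float]], float]


def run_on_premises(model: SubmittedModel,
                    train_rows: List[Row],
                    test_rows: List[Row]) -> Dict[str, float]:
    """Host-side runner: trains and scores the model, returns aggregate metrics only."""
    state = model.train(train_rows)
    scores = [model.predict(state, features) for features, _ in test_rows]
    labels = [label for _, label in test_rows]
    # Accuracy at a 0.5 cutoff is used here purely to keep the sketch short.
    correct = sum((s >= 0.5) == bool(y) for s, y in zip(scores, labels))
    return {"n_test": float(len(labels)), "accuracy": correct / len(labels)}


# Toy model (assumption): threshold on the first feature, e.g. age.
def train_threshold(rows: List[Row]) -> float:
    dead = [x[0] for x, y in rows if y == 1]
    alive = [x[0] for x, y in rows if y == 0]
    # Midpoint between the two class means.
    return (sum(dead) / len(dead) + sum(alive) / len(alive)) / 2


def predict_threshold(threshold: float, features: Sequence[float]) -> float:
    return 1.0 if features[0] >= threshold else 0.0


# Synthetic rows standing in for private data; the developer never sees these.
private_train = [([80.0], 1), ([75.0], 1), ([30.0], 0), ([40.0], 0)]
private_test = [([90.0], 1), ([20.0], 0), ([60.0], 0)]

metrics = run_on_premises(
    SubmittedModel(train=train_threshold, predict=predict_threshold),
    private_train,
    private_test,
)
print(metrics)
```

In this sketch the developer receives only the `metrics` dictionary; row-level records and per-patient predictions stay inside `run_on_premises`.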
The model developer was able to develop 3 mortality prediction models under the MTD framework using simple demographic features (area under the receiver-operating characteristic curve [AUROC], 0.693), demographics and 5 common chronic diseases (AUROC, 0.861), and the 1000 most common features from the EHR's condition/procedure/drug domains (AUROC, 0.921).
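For readers less familiar with the metric reported above, AUROC can be read as the probability that a randomly chosen positive case (here, a patient who died) receives a higher score than a randomly chosen negative case, with ties counted as one half. The following is a minimal pure-Python sketch of that pairwise definition; the function name `auroc` and the example values are illustrative assumptions, not data or code from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    Runs in O(n_pos * n_neg) time, which is fine for illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values only: 3 of the 4 positive/negative pairs are ordered correctly.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example)
```

Under this reading, an AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two outcome groups.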
We demonstrate the feasibility of the MTD framework to facilitate the development of predictive models on private EHR data, enabled by common data models and containerization software. We identify challenges that both the model developer and the health system information technology group encountered and propose future efforts to improve implementation.
The MTD framework lowers the barrier of access to EHR data and can accelerate the development and evaluation of clinical prediction models.