Appelbaum Limor, Kaplan Irving D., Palchuk Matvey B., Kundrot Steven, Winer-Jones Jessamine P., Rinard Martin
Department of Radiation Oncology, Beth Israel Deaconess Medical Center, Boston, MA, USA
TriNetX, LLC, Cambridge, MA, USA
Early diagnosis is critical to improving survival rates of lethal cancers, such as pancreatic ductal adenocarcinoma (PDAC). However, there are no reliable screening tests for these cancers. In this chapter, we present potential methods for predicting early, evolving cancers by leveraging readily available electronic health record (EHR) data and machine learning. We discuss the various aspects of our collaborative experience, involving clinical and computer scientists, in navigating the process of using EHRs to develop cancer risk prediction models. This chapter is intended to serve as a guide to others performing this type of research. Based on our initial experience of model development using single-institution data, we cover the different steps involved, including data acquisition, querying and downloading data, protecting patient confidentiality, data curation, model development, and validation. Challenges encountered when using single-institution data are presented, along with lessons learned. Drawing from our experience working with a federated database of EHR data from multiple institutions to develop a risk prediction model for PDAC, we also discuss how many of these challenges can be addressed by using such a database. Finally, we discuss future clinical opportunities that may arise from leveraging data from a federated network, such as the deployment of risk models for clinical studies.