Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY.
Division of Research, Kaiser Permanente Northern California, Oakland, CA.
JCO Clin Cancer Inform. 2024 Apr;8:e2300209. doi: 10.1200/CCI.23.00209.
Identifying patients' intended chemotherapy regimens is critical to most research questions addressed in the real-world setting of cancer care. Yet these data are not routinely available in electronic health records (EHRs) at the specificity such questions require. We developed a methodology to identify patients' intended regimens from EHR data in the Optimal Breast Cancer Chemotherapy Dosing (OBCD) study.
In women older than 18 years, diagnosed with primary stage I-IIIA breast cancer at Kaiser Permanente Northern California (2006-2019), we categorized participants into 24 drug combinations described in National Comprehensive Cancer Network guidelines for breast cancer treatment. Participants were categorized into 50 guideline chemotherapy administration schedules within these combinations using an iterative algorithm process, followed by chart abstraction where necessary. We also identified patients intended to receive nonguideline administration schedules within guideline drug combinations and nonguideline drug combinations. This process was adapted at Kaiser Permanente Washington using abstracted data (2004-2015).
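The first categorization step described above, assigning a patient to a guideline drug combination or routing the case to chart abstraction, can be sketched as follows. This is only a minimal illustration under stated assumptions: the abstract does not enumerate the 24 combinations or specify the algorithm's rules, so the three regimens listed, the exact-set matching rule, and the names `GUIDELINE_COMBINATIONS` and `classify_combination` are all hypothetical.

```python
from typing import Iterable, Tuple

# Illustrative subset only (assumption): the study used 24 guideline drug
# combinations, which the abstract does not list.
GUIDELINE_COMBINATIONS = {
    frozenset({"doxorubicin", "cyclophosphamide"}): "AC",
    frozenset({"docetaxel", "cyclophosphamide"}): "TC",
    frozenset({"docetaxel", "carboplatin", "trastuzumab"}): "TCH",
}


def classify_combination(drugs: Iterable[str]) -> Tuple[str, str]:
    """Return (label, source) for the set of drugs a patient received.

    Hypothetical rule: an exact match on the normalized drug set is
    resolved by the algorithm; anything else is flagged for abstraction.
    """
    observed = frozenset(d.strip().lower() for d in drugs)
    label = GUIDELINE_COMBINATIONS.get(observed)
    if label is not None:
        return label, "algorithm"
    return "unresolved", "abstraction"
```

A real implementation would also need dates and doses to distinguish administration schedules within a combination, which this sketch does not attempt.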
In the OBCD cohort, 13,231 women received adjuvant or neoadjuvant chemotherapy. Of these, 10,213 (77%) had their intended regimen identified via the algorithm, 2,416 (18%) had it identified via abstraction, and for 602 (4.5%) the intended regimen could not be identified. Across guideline drug combinations, 111 nonguideline dosing schedules were used, alongside 61 nonguideline drug combinations. Several factors were associated with requiring abstraction for regimen determination, including decreasing neighborhood household income, earlier diagnosis year, later stage, nodal status, and human epidermal growth factor receptor 2 (HER2)+ status.
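As a quick consistency check on the reported counts, the proportions can be recomputed from the cohort total. The variable names are illustrative; the counts are those given in the abstract.

```python
# Counts as reported in the abstract.
total = 13_231
counts = {"algorithm": 10_213, "abstraction": 2_416, "unidentified": 602}

# The three groups should partition the cohort.
assert sum(counts.values()) == total

# Percentage of the cohort in each group, to one decimal place.
percentages = {k: round(100 * v / total, 1) for k, v in counts.items()}
print(percentages)
```

To one decimal place the shares are 77.2%, 18.3%, and 4.5%, consistent with the rounded figures reported above.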
We describe the challenges of, and approaches to, operationalizing complex real-world data to identify intended chemotherapy regimens in large observational studies. This methodology can make the use of large-scale clinical data in real-world populations more efficient, helping to answer critical questions and improve care delivery and patient outcomes.