Department of Clinical Pharmacy and Toxicology, Leiden University Medical Center, Leiden, The Netherlands.
Clin Pharmacol Ther. 2020 Sep;108(3):644-652. doi: 10.1002/cpt.1966. Epub 2020 Jul 18.
Real-world evidence can close the inferential gap between marketing authorization studies and clinical practice. However, the current standard for extracting real-world data from electronic health records (EHRs) for treatment evaluation is manual review (MR), which is time-consuming and laborious. Clinical Data Collector (CDC) is a novel natural language processing and text mining software tool that handles both structured and unstructured EHR data and shows only the relevant EHR sections, which improves efficiency. We investigated CDC as a real-world data (RWD) collection method by applying CDC queries for patient inclusion and information extraction to a cohort of patients with metastatic renal cell carcinoma (RCC) receiving systemic drug treatment. Baseline patient characteristics, disease characteristics, and treatment outcomes were extracted and compared with MR for validation. Using CDC, 100 patients receiving 175 treatments were included, corresponding to 99% agreement with MR. Calculated median overall survival was 21.7 months (95% confidence interval (CI) 18.7-24.8) vs. 21.7 months (95% CI 18.6-24.8), and median progression-free survival was 8.9 months (95% CI 5.4-12.4) vs. 7.6 months (95% CI 5.7-9.4), for CDC vs. MR, respectively. The highest F1-scores were found for cancer-related variables (88.1-100), followed by comorbidities (71.5-90.4) and adverse drug events (53.3-74.5); the widest range of scores was seen for the International Metastatic RCC Database Consortium criteria (51.4-100). Mean data collection time was 12 minutes (CDC) vs. 86 minutes (MR). In conclusion, CDC is a promising tool for retrieving RWD from EHRs because it can identify the correct patient population as well as relevant outcome data, such as overall survival and progression-free survival.
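The abstract does not state how the F1-scores were computed. As a minimal sketch, assuming the standard definition (harmonic mean of precision and recall), MR as the reference standard, and scores scaled to 0-100 as reported above, the calculation for a single variable might look like this. The function name `f1_score` and the patient identifiers are hypothetical and are not taken from the CDC software or the study data.

```python
def f1_score(extracted: set, reference: set) -> float:
    """F1-score (0-100) of extracted items against a reference standard.

    Assumption: `reference` holds the items found by manual review and
    `extracted` holds the items found by the automated tool.
    """
    tp = len(extracted & reference)   # found by both methods
    fp = len(extracted - reference)   # found only by the tool
    fn = len(reference - extracted)   # missed by the tool
    if tp == 0:
        return 0.0                    # avoids division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical example: patients flagged with a given comorbidity.
tool_hits = {"p01", "p02", "p03", "p05"}
manual_hits = {"p01", "p02", "p03", "p04"}
score = f1_score(tool_hits, manual_hits)  # precision 0.75, recall 0.75
```

Because F1 weighs precision and recall equally, a low score can reflect either missed items or spurious extractions.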