Department of Information Systems, Oscar Lambret Cancer Center, Lille, France.
Department of Medical Oncology, Oscar Lambret Cancer Center, Lille, France.
JCO Clin Cancer Inform. 2023 Jan;7:e2200139. doi: 10.1200/CCI.22.00139.
Imaging reports in oncology provide critical information about disease evolution that should be shared promptly to tailor clinical decision making and care coordination for patients with advanced cancer. However, tumor response remains unstructured in free text and is underexploited. Natural language processing (NLP) methods can help bring this critical information into the electronic health record (EHR) in real time to assist health care workers.
A rule-based algorithm was developed using SAS tools to automatically extract tumor response and categorize it as progression or no progression. A total of 2,970 French-language magnetic resonance imaging, computed tomography, and positron emission tomography reports were extracted from the EHR of a large comprehensive cancer center to build a 2,637-document training set and a 603-document validation set. The model was also tested on 189 imaging reports from 46 different radiology centers. A tumor dashboard was created in the EHR using the Timeline tool of the vis.js JavaScript library.
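To illustrate how classified reports could feed a timeline view, the following is a minimal Python sketch that turns per-report labels into item records with the `id`, `content`, `start`, and `className` fields that vis.js Timeline items accept. The function name, the input tuple layout, and the CSS class names are assumptions made for illustration; the abstract does not describe the dashboard's actual data model.

```python
import json
from datetime import date


def to_timeline_items(reports):
    """Convert (exam_date, modality, label) tuples into timeline item dicts.

    The tuple layout and the class names below are hypothetical.
    """
    items = []
    # Sort chronologically so item ids follow the order of the exams.
    for index, (exam_date, modality, label) in enumerate(sorted(reports), start=1):
        items.append({
            "id": index,
            "content": f"{modality}: {label}",
            "start": exam_date.isoformat(),
            # A CSS class per category lets the front end color each item.
            "className": "progression" if label == "progression" else "no-progression",
        })
    return items


reports = [
    (date(2022, 6, 1), "CT", "progression"),
    (date(2022, 3, 1), "MRI", "no progression"),
]
# JSON payload that a front end could load into a timeline data set.
payload = json.dumps(to_timeline_items(reports), indent=2)
```

Serializing to JSON keeps the classification step independent of the front end, so the same records could be produced for both retrospective and prospective reports.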
An NLP methodology was applied to create an ontology of radiographic terms defining tumor response, map text to five main concepts, and apply decision rules based on RECIST clinical practice guidelines. The model achieved an overall accuracy of 0.88 (ranging from 0.87 to 0.94), with similar performance on both progression and no progression classification. The overall accuracy was 0.82 on reports from different radiology centers. Data were visualized and organized in a dynamic tumor response timeline. This tool was successfully deployed at our institution, both retrospectively and prospectively, as part of an automatic pipeline that screens reports and classifies tumor response in real time for all patients with metastatic disease.
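The general idea of mapping report text to concepts and then applying a decision rule can be sketched in Python as follows. This is not the authors' SAS implementation: the abstract does not list the five concepts or the ontology terms, so the concept names, the French term patterns, and the sentence-level negation rule below are all hypothetical and only loosely inspired by the description above.

```python
import re
import unicodedata

# Hypothetical lexicon: concept names and French terms are illustrative only.
CONCEPT_PATTERNS = {
    "progression": r"progression|majoration|augmentation|nouvelle lesion|apparition",
    "stability": r"stabilite|stable|inchange",
    "response": r"regression|diminution|reponse partielle",
    "complete_response": r"reponse complete|disparition",
    "negation": r"\b(?:pas|absence|sans)\b",
}


def normalize(text):
    """Lowercase the text and strip accents so patterns can be written in ASCII."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def concepts_in(sentence):
    """Return the set of concept names whose pattern occurs in the sentence."""
    return {name for name, pattern in CONCEPT_PATTERNS.items()
            if re.search(pattern, sentence)}


def classify_report(text):
    """Assumed rule: any non-negated progression sentence means progression."""
    for sentence in re.split(r"[.;\n]+", normalize(text)):
        found = concepts_in(sentence)
        if "progression" in found and "negation" not in found:
            return "progression"
    return "no progression"


def accuracy(predicted, expected):
    """Fraction of reports whose predicted label matches the reference label."""
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)
```

For example, `classify_report("Pas de nouvelle lésion. Lésions stables.")` returns `"no progression"` because the only progression term appears in a negated sentence, whereas `classify_report("Majoration des adénopathies médiastinales.")` returns `"progression"`. A production system would need a much richer lexicon and more careful negation scoping than this sketch provides.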
Our approach provides an NLP-based framework to structure and classify tumor response from the EHR and integrate tumor response classification into the clinical oncology workflow.