Levine Mark N, Alexander Gordon, Sathiyapalan Arani, Agrawal Anjali, Pond Greg
McMaster University, Hamilton, Ontario, Canada.
Escarpment Cancer Research Institute, Hamilton, Ontario, Canada.
JCO Clin Cancer Inform. 2019 Aug;3:1-11. doi: 10.1200/CCI.19.00032.
Clinicians need accurate and timely information on the impact of treatments on patient outcomes. The electronic health record (EHR) offers the potential for insight into real-world patient experiences and outcomes, but it is difficult to tap into. Our goal was to apply artificial intelligence technology to the EHR to characterize the clinical course of patients with stage III breast cancer.
Data from patients with stage III breast cancer who presented between 2013 and 2015 were extracted from the EHR, de-identified, and imported into the IBM Cloud. Specialized natural language processing (NLP) annotators were developed to extract medical concepts from unstructured clinical text and transform them into structured attributes. In the validation phase, these annotators were applied to 19 additional patients with stage III breast cancer from the same period. The resulting data were compared with those in the medical chart (gold standard) for nine key indicators.
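To make the idea of an annotator concrete, the following is a minimal sketch of how free text might be turned into structured attributes. It is not the study's implementation: the abstract does not describe how the IBM-hosted annotators work, and the regular expression, the function name `annotate_receptors`, the attribute keys, and the sample note are all hypothetical choices made for illustration.

```python
import re

# Hypothetical pattern: a receptor mention followed by a positive/negative status.
# Real clinical annotators handle negation, abbreviations, and context far more
# carefully; this only illustrates the text-to-attribute step.
RECEPTOR_PATTERN = re.compile(
    r"\b(?P<receptor>ER|PR|estrogen receptor|progesterone receptor)\b"
    r"\s*(?:is|was|:|-)?\s*"
    r"(?P<status>positive|negative)\b",
    re.IGNORECASE,
)


def annotate_receptors(note):
    """Return a dict of structured receptor attributes found in a free-text note."""
    attributes = {}
    for match in RECEPTOR_PATTERN.finditer(note):
        name = match.group("receptor").lower()
        if name in ("er", "estrogen receptor"):
            key = "estrogen_receptor"
        else:
            key = "progesterone_receptor"
        # A later mention overwrites an earlier one in this simple sketch.
        attributes[key] = match.group("status").lower()
    return attributes


# Made-up sample note, not taken from the study data.
sample_note = "Invasive ductal carcinoma. ER positive, PR negative."
print(annotate_receptors(sample_note))
```

A rule-based pattern is used here only because it is short and self-contained; the same input and output shape would apply to a statistical or dictionary-based annotator.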
Information was extracted for 50 patients, including tumor stage (94% stage IIIA, 6% stage IIIB), age (28% 50 years or younger, 52% between 51 and 70 years, and 24% older than 70 years), receptor status (84% estrogen receptor positive, 74% progesterone receptor positive), and first treatment (72% surgery, 26% chemotherapy, 2% endocrine). Events in the patient's journey were compiled to create a timeline. For 171 data elements, NLP and the chart disagreed for 41 (24%; 95% CI, 17.8% to 31.1%). With additional manipulation using simple logic, the disagreement was reduced to six elements (3.5%; 95% CI, 1.3% to 7.5%; F1 statistic, 0.9694).
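The disagreement figures can be checked with a short calculation: 19 validation patients times nine indicators gives 171 data elements, and 41/171 and 6/171 give the reported 24% and 3.5%. The sketch below also computes exact (Clopper-Pearson) binomial confidence intervals. The abstract does not state which interval method was used, so the choice of Clopper-Pearson is an assumption, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_p(j, n, target, iterations=60):
    """Find p such that P(X <= j | n, p) == target by bisection.

    The binomial CDF is decreasing in p, so a CDF above the target
    means the candidate p is too small.
    """
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if binom_cdf(j, n, mid) > target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    lower = 0.0 if k == 0 else solve_p(k - 1, n, 1 - alpha / 2)
    upper = 1.0 if k == n else solve_p(k, n, alpha / 2)
    return lower, upper


n_elements = 19 * 9  # validation patients x key indicators = 171
for label, k in (("NLP alone", 41), ("NLP plus simple logic", 6)):
    lower, upper = clopper_pearson(k, n_elements)
    print(f"{label}: {k}/{n_elements} = {k / n_elements:.1%}, "
          f"95% CI {lower:.1%} to {upper:.1%}")
```

If the authors used an exact method, the printed intervals should land close to the reported ones; a Wald or Wilson interval would differ slightly.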
It is possible to extract, read, and combine data from the EHR to view the patient journey. The agreement between NLP and the gold standard was high, which supports validity.