Developing Artificial Intelligence Models for Extracting Oncologic Outcomes from Japanese Electronic Health Records.

INTRODUCTION: A framework for extracting oncologic outcomes from large-scale databases using artificial intelligence (AI) is not well established. We therefore aimed to develop AI models that extract outcomes in patients with lung cancer from unstructured text in the electronic health records (EHRs) of multiple hospitals.

METHODS: We constructed AI models (Bidirectional Encoder Representations from Transformers [BERT], Naïve Bayes, and Longformer) for tumor evaluation using the University of Miyazaki Hospital (UMH) database. The data set included both structured and unstructured data from progress notes, radiology reports, and discharge summaries. The BERT model was then applied to the Life Data Initiative (LDI) data set from six hospitals. Study outcomes were the performance of the AI models and the time to progression of disease (TTP) for each line of treatment, based on the treatment response extracted by the AI models.

RESULTS: On the UMH data set, the BERT model achieved higher recall and F1 score than the Naïve Bayes and Longformer models, although its precision was lower than that of Naïve Bayes (precision: 0.42 vs. 0.47 and 0.22; recall: 0.63 vs. 0.46 and 0.33; F1 score: 0.50 vs. 0.46 and 0.27). When this BERT model was applied to the LDI data, prediction accuracy remained similar. Kaplan-Meier plots of TTP (months) showed similar trends between the data predicted by the BERT model and the manually curated data for the first line (median 14.9 [95% confidence interval 11.5, 21.1] and 16.8 [12.6, 21.8]), the second line (7.8 [6.7, 10.7] and 7.8 [6.7, 10.7]), and later lines of treatment.

CONCLUSION: We developed AI models to extract treatment responses in patients with lung cancer from a large EHR database; however, the model requires further improvement.
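The reported F1 scores can be sanity-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The following is a minimal sketch, not the authors' evaluation code: the helper name `f1_score` and the `reported` dictionary are illustrative, and the inputs are the rounded values quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the UMH data set, as quoted in the abstract.
# These are rounded to two decimals, so recomputed values can differ slightly.
reported = {
    "BERT": (0.42, 0.63, 0.50),
    "Naive Bayes": (0.47, 0.46, 0.46),
    "Longformer": (0.22, 0.33, 0.27),
}

for name, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{name}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported:.2f}")
```

The recomputed values agree with the reported ones to within about 0.01, which is the size of discrepancy one would expect from working with rounded precision and recall.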
