
Natural Language Processing to Identify Cancer Treatments With Electronic Medical Records.

Affiliations

Department of Management Science and Engineering, Huang Engineering Center, Stanford, CA.

Department of Biomedical Informatics, Department of Radiology, Emory University School of Medicine, Atlanta, GA.

Publication information

JCO Clin Cancer Inform. 2021 Apr;5:379-393. doi: 10.1200/CCI.20.00173.

DOI: 10.1200/CCI.20.00173
PMID: 33822653
Abstract

PURPOSE

Knowing the treatments administered to patients with cancer is important for treatment planning and correlating treatment patterns with outcomes for personalized medicine study. However, existing methods to identify treatments are often lacking. We develop a natural language processing approach with structured electronic medical records and unstructured clinical notes to identify the initial treatment administered to patients with cancer.

METHODS

We used data from 4,412 patients with 483,782 clinical notes from the Stanford Cancer Institute Research Database, covering patients with nonmetastatic prostate, oropharynx, and esophagus cancer. We trained treatment identification models for each cancer type separately and compared the performance of using only structured data, only unstructured data, and combinations of both. We optimized the identification model across five machine learning methods (logistic regression, multilayer perceptrons, random forest, support vector machines, and stochastic gradient boosting). We treated the treatment information recorded in the cancer registry as the gold standard and compared our methods with a baseline that identifies treatments from billing codes.
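The idea of combining structured and unstructured inputs can be illustrated with a minimal sketch. It assumes a simple bag-of-words representation for the notes and two hypothetical binary structured flags; the abstract does not specify the actual text representation or structured fields, so the function names, toy notes, and flags below are illustrative only.

```python
from collections import Counter


def build_vocabulary(notes):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for note in notes for tok in note.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def text_features(note, vocab):
    """Bag-of-words counts for one note, ordered by the vocabulary."""
    counts = Counter(note.lower().split())
    # dicts preserve insertion order, so iteration follows the sorted tokens
    return [counts.get(tok, 0) for tok in vocab]


def combined_features(note, structured, vocab):
    """Concatenate structured values with the note's bag-of-words counts."""
    return list(structured) + text_features(note, vocab)


# Hypothetical toy data, not taken from the study.
notes = [
    "patient underwent radical prostatectomy",
    "external beam radiation therapy started",
]
# Hypothetical structured flags: [surgery code present, radiation code present]
structured = [[1, 0], [0, 1]]

vocab = build_vocabulary(notes)
rows = [combined_features(n, s, vocab) for n, s in zip(notes, structured)]
```

Each row in `rows` could then be passed to any of the five classifier families named above; a structured-only or text-only model would use just the first two or the remaining columns, respectively.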

RESULTS

For prostate cancer, we achieved an F1 score of 0.99 (95% CI, 0.97 to 1.00) for radiation and 1.00 (95% CI, 0.99 to 1.00) for surgery. For oropharynx cancer, we achieved an F1 score of 0.78 (95% CI, 0.58 to 0.93) for chemoradiation and 0.83 (95% CI, 0.69 to 0.95) for surgery. For esophagus cancer, we achieved an F1 score of 1.0 (95% CI, 1.0 to 1.0) for both chemoradiation and surgery using all combinations of structured and unstructured data. We found that employing the free-text clinical notes outperforms using the billing codes or only structured data for all three cancer types.
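For readers unfamiliar with the metric, the following sketch shows how an F1 score and a 95% interval around it might be computed. The percentile bootstrap used here is an assumption: the abstract does not say how the confidence intervals were obtained, and the labels below are hypothetical toy values rather than study data.

```python
import random


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels (1 = treatment given)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F1 (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample patients with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical registry labels (gold standard) and model predictions.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

point = f1_score(y_true, y_pred)          # 4 TP, 1 FP, 1 FN
lower, upper = bootstrap_ci(y_true, y_pred)
```

With small test sets the interval can be wide, which is consistent with the broader intervals reported for oropharynx cancer.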

CONCLUSION

Our results show that treatment identification using free-text clinical notes greatly improves upon the performance using billing codes and simple structured data. The approach can be used for treatment cohort identification and adapted for longitudinal cancer treatment identification.

