Automated NLP Extraction of Clinical Rationale for Treatment Discontinuation in Breast Cancer.

Affiliations

CSAIL & IMES, Massachusetts Institute of Technology, Cambridge, MA.

Harvard Medical School, Boston, MA.

Publication information

JCO Clin Cancer Inform. 2021 May;5:550-560. doi: 10.1200/CCI.20.00139.

PMID: 33989016
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8462597/
Abstract

PURPOSE

Key oncology end points are not routinely encoded into electronic medical records (EMRs). We assessed whether natural language processing (NLP) can abstract treatment discontinuation rationale from unstructured EMR notes to estimate toxicity incidence and progression-free survival (PFS).

METHODS

We constructed a retrospective cohort of 6,115 patients with early-stage and 701 patients with metastatic breast cancer initiating care at Memorial Sloan Kettering Cancer Center from 2008 to 2019. Each cohort was divided into training (70%), validation (15%), and test (15%) subsets. Human abstractors identified the clinical rationale associated with treatment discontinuation events. Concatenated EMR notes were used to train high-dimensional logistic regression and convolutional neural network models. Kaplan-Meier analyses were used to compare toxicity incidence and PFS estimated by our NLP models to estimates generated by manual labeling and time-to-treatment discontinuation (TTD).
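The cohort split and the Kaplan-Meier comparison described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `split_cohort` and `kaplan_meier`, the shuffling seed, and the toy data are assumptions made for illustration, and only the 70%/15%/15% proportions and the use of a Kaplan-Meier estimate come from the abstract.

```python
import random
from typing import List, Sequence, Tuple


def split_cohort(patient_ids: Sequence[int], seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle patient IDs and split them into 70% / 15% / 15% subsets.

    Hypothetical helper: the abstract gives the proportions only,
    not how patients were assigned to each subset.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 70 // 100  # integer arithmetic avoids float rounding
    n_val = len(ids) * 15 // 100
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def kaplan_meier(durations: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is True if the event (for example progression) was observed
    at durations[i], and False if the patient was censored at that time.
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events observed exactly at time t.
        n_events = sum(1 for d, e in zip(durations, events) if e and d == t)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    train, val, test = split_cohort(range(6115))
    print(len(train), len(val), len(test))
    # Toy follow-up data: months to event or censoring.
    print(kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False]))
```

Under this sketch, curves from NLP-derived labels, manual labels, and TTD would each be produced by calling `kaplan_meier` with the same durations but different event indicators. The statistical comparison between curves is not shown here.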

RESULTS

Our best high-dimensional logistic regression models identified toxicity events in early-stage patients with an area under the curve of the receiver-operator characteristic of 0.857 ± 0.014 (standard deviation) and progression events in metastatic patients with an area under the curve of 0.752 ± 0.027 (standard deviation). NLP-extracted toxicity incidence and PFS curves were not significantly different from manually extracted curves (P = .95 and P = .67, respectively). By contrast, TTD overestimated toxicity in early-stage patients (P < .001) and underestimated PFS in metastatic patients (P < .001). Additionally, we tested an extrapolation approach in which 20% of the metastatic cohort were labeled manually, and NLP algorithms were used to abstract the remaining 80%. This extrapolated outcomes approach resolved PFS differences between receptor subtypes (P < .001 for hormone receptor+/human epidermal growth factor receptor 2- versus human epidermal growth factor receptor 2+ versus triple-negative) that could not be resolved with TTD.
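For readers unfamiliar with the area under the receiver-operator characteristic curve reported above, the following is a minimal sketch of how such a value can be computed from model scores. It uses the standard pairwise (Mann-Whitney) formulation; the function name `roc_auc` and the toy labels and scores are assumptions for illustration and do not reproduce the study's data or evaluation code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: 1 = discontinuation labeled as toxicity, 0 = other reason.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(labels, scores))
```

The quadratic loop is chosen for clarity. A rank-based implementation gives the same value more efficiently on large test sets.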

CONCLUSION

NLP models are capable of abstracting treatment discontinuation rationale with minimal manual labeling.
