Development and Use of Natural Language Processing for Identification of Distant Cancer Recurrence and Sites of Distant Recurrence Using Unstructured Electronic Health Record Data.

Affiliations

Department of Medicine, Stanford University School of Medicine, Stanford, CA.

Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA.

Publication information

JCO Clin Cancer Inform. 2021 Apr;5:469-478. doi: 10.1200/CCI.20.00165.

DOI: 10.1200/CCI.20.00165
PMID: 33929889
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8462655/

Abstract

PURPOSE

Large-scale analysis of real-world evidence is often limited to structured data fields that do not contain reliable information on recurrence status and disease sites. In this report, we describe a natural language processing (NLP) framework that uses data from free-text, unstructured reports to classify recurrence status and sites of recurrence for patients with breast and hepatocellular carcinomas (HCC).

METHODS

Using two cohorts of breast cancer and HCC cases, we validated the ability of a previously developed NLP model to distinguish between no recurrence, local recurrence, and distant recurrence, based on clinician notes, radiology reports, and pathology reports compared with manual curation. A second NLP model was trained and validated to identify sites of recurrence. We compared the ability of each NLP model to identify the presence, timing, and site of recurrence, when compared against manual chart review and International Classification of Diseases coding.
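
The comparison against manual chart review can be pictured as a per-class agreement tally. The sketch below is illustrative only: the label names, the function `per_class_recall`, and the toy data are assumptions made for this example, not the study's code, labels, or data.

```python
from collections import Counter
from typing import Dict, Sequence

# Hypothetical names for the three recurrence classes described in the abstract.
LABELS = ("none", "local", "distant")


def per_class_recall(reference: Sequence[str], predicted: Sequence[str]) -> Dict[str, float]:
    """Fraction of chart-review cases in each class that the model labels identically."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    totals = Counter(reference)
    hits = Counter(r for r, p in zip(reference, predicted) if r == p)
    # Skip classes that never occur in the reference to avoid dividing by zero.
    return {label: hits[label] / totals[label] for label in LABELS if totals[label]}


# Toy data, invented for illustration.
chart_review = ["none", "none", "local", "distant", "distant", "distant"]
model_output = ["none", "local", "local", "distant", "distant", "none"]
recall = per_class_recall(chart_review, model_output)
```

Treating manual chart review as the reference standard, each value is the share of reviewed cases in that class that the model recovers.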

RESULTS

A total of 1,273 patients were included in the development and validation of the two models. The NLP model for recurrence detects distant recurrence with an area under the curve of 0.98 (95% CI, 0.96 to 0.99) and 0.95 (95% CI, 0.88 to 0.98) in breast and HCC cohorts, respectively. The mean accuracy of the NLP model for detecting any site of distant recurrence was 0.9 for breast cancer and 0.83 for HCC. The NLP model for recurrence identified a larger proportion of patients with distant recurrence in a breast cancer database (11.1%) compared with International Classification of Diseases coding (2.31%).
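
For readers unfamiliar with the reported metrics, here is a minimal sketch of how an area under the curve and an interval around it can be computed. The abstract does not say how its confidence intervals were obtained; the percentile bootstrap used here, the function names, and the toy scores are all assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels: Sequence[int], scores: Sequence[float],
                 n_boot: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap 95% interval for the AUC (an assumed method, not the paper's)."""
    auc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Toy example, invented for illustration.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
lo, hi = bootstrap_ci(toy_labels, toy_scores, n_boot=200)

# Ratio of the two proportions reported above: NLP (11.1%) vs. ICD coding (2.31%).
ratio = 11.1 / 2.31  # roughly 4.8
```

On the figures reported above, the NLP model flagged distant recurrence in roughly 4.8 times as many patients as International Classification of Diseases coding did.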

CONCLUSION

We developed two NLP models to identify distant cancer recurrence, timing of recurrence, and sites of recurrence based on unstructured electronic health record data. These models can be used to perform large-scale retrospective studies in oncology.
