Use of Natural Language Processing to Infer Sites of Metastatic Disease From Radiology Reports at Scale.

Affiliations

Division of Medical Oncology, National Cancer Centre Singapore, Singapore, Singapore.

NUS Yong Loo Lin School of Medicine, Singapore, Singapore.

Publication information

JCO Clin Cancer Inform. 2024 May;8:e2300122. doi: 10.1200/CCI.23.00122.

DOI:10.1200/CCI.23.00122
PMID:38788166
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11371090/
Abstract

PURPOSE

To evaluate natural language processing (NLP) methods to infer metastatic sites from radiology reports.

METHODS

A set of 4,522 computed tomography (CT) reports of 550 patients with 14 types of cancer was used to fine-tune four clinical large language models (LLMs) for multilabel classification of metastatic sites. We also developed an NLP information extraction (IE) system (on the basis of named entity recognition, assertion status detection, and relation extraction) for comparison. Model performance was measured by F1 scores on a test set and three external validation sets. The best model was used to facilitate analysis of metastatic frequencies in a cohort study of 6,555 patients with 53,838 CT reports.
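The three stages named for the IE system (named entity recognition, assertion status detection, relation extraction) can be illustrated with a toy rule-based sketch. This is not the authors' system: the lexicons, the negation cues, the sentence-level co-occurrence rule, and the function name `infer_sites` are all assumptions made for illustration, standing in for what would be trained components in practice.

```python
import re

# Hypothetical mini-lexicons (assumptions); a real system would use trained NER models.
SITE_TERMS = {
    "liver": "liver", "hepatic": "liver",
    "lung": "lung", "pulmonary": "lung",
    "bone": "bone", "osseous": "bone",
}
FINDING_PATTERN = re.compile(r"\b(metastasis|metastases|metastatic)\b")
NEGATION_PATTERN = re.compile(r"\b(no|without|negative for)\b")


def infer_sites(report: str) -> set[str]:
    """Return the set of sites asserted as metastatic in a free-text report."""
    sites: set[str] = set()
    # Split into sentences on '.' or ';' followed by whitespace.
    for sentence in re.split(r"(?<=[.;])\s+", report.lower()):
        # Stage 1 (NER stand-in): require a metastasis mention, then find site mentions.
        if not FINDING_PATTERN.search(sentence):
            continue
        found = {
            label
            for term, label in SITE_TERMS.items()
            if re.search(rf"\b{term}\b", sentence)
        }
        # Stage 2 (assertion stand-in): skip sentences containing a negation cue.
        if NEGATION_PATTERN.search(sentence):
            continue
        # Stage 3 (relation stand-in): link sites to a finding in the same sentence.
        sites |= found
    return sites


report = (
    "Multiple hepatic metastases are seen. "
    "No pulmonary metastases. "
    "Osseous structures are unremarkable."
)
print(infer_sites(report))
```

Each report yields a set of site labels, which is the same multilabel output format the fine-tuned LLM classifiers would be evaluated on.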

RESULTS

The RadBERT, BioBERT, GatorTron-base, and GatorTron-medium LLMs achieved F1 scores of 0.84, 0.87, 0.89, and 0.91, respectively, on the test set. The IE system performed best, achieving an F1 score of 0.93. F1 scores of the IE system by individual cancer type ranged from 0.89 to 0.96. On external validation sets comprising additional cancer types, positron emission tomography-CT scans, and magnetic resonance imaging scans, the IE system attained F1 scores of 0.89, 0.83, and 0.81, respectively. In our cohort study, we found that for colorectal cancer, liver-only metastases were more frequent in de novo stage IV versus recurrent patients (29.7% v 12.2%; P < .001). Conversely, lung-only metastases were more frequent in recurrent versus de novo stage IV patients (17.2% v 7.3%; P < .001).
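For readers unfamiliar with how an F1 score applies to multilabel output, the following minimal sketch computes a micro-averaged F1 over per-report sets of site labels. The abstract does not state which averaging scheme was used, so micro-averaging is an assumption here, and the function name `micro_f1` and the example labels are hypothetical.

```python
def micro_f1(true_labels: list[set[str]], predicted_labels: list[set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives over all reports."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # sites correctly predicted
        fp += len(pred - truth)   # sites predicted but not present
        fn += len(truth - pred)   # sites present but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example labels for three reports (not data from the study).
truth = [{"liver", "lung"}, {"bone"}, set()]
pred = [{"liver"}, {"bone", "lung"}, set()]
print(round(micro_f1(truth, pred), 3))
```

In this example there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and so is the F1 score.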

CONCLUSION

We developed an IE system that accurately infers metastatic sites in multiple primary cancers from radiology reports. It has explainable methods and performs better than some clinical LLMs. The inferred metastatic phenotypes could enhance cancer research databases and clinical trial matching, and identify potential patients for oligometastatic interventions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22d3/11371090/840df5febc9e/cci-8-e2300122-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22d3/11371090/7fc0836958c7/cci-8-e2300122-g001.jpg
