Artificial Intelligence-Assisted Cancer Status Detection in Radiology Reports.

Affiliations

Digital, Informatics and Technology Solutions, Memorial Sloan Kettering Cancer Center, New York, New York.

Department of Translational Informatics, Memorial Sloan Kettering Cancer Center, New York, New York.

Publication information

Cancer Res Commun. 2024 Apr 9;4(4):1041-1049. doi: 10.1158/2767-9764.CRC-24-0064.

DOI:10.1158/2767-9764.CRC-24-0064

PMID:38592452

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11003452/

Abstract

UNLABELLED

Cancer research depends on accurate and relevant information about a patient's medical journey. Data in radiology reports are extremely valuable but lack the consistent structure needed for direct use in analytics. At Memorial Sloan Kettering Cancer Center (MSKCC), radiology reports are curated using the gold-standard approach of human annotation. However, manually curating a large volume of retrospective data slows the pace of cancer research. The manual curation process is sensitive to the volume of reports, the number of data elements, and the nature of the reports, and it demands an appropriate skill set. In this work, we explore state-of-the-art methods in artificial intelligence (AI) and implement an end-to-end pipeline for fast and accurate annotation of radiology reports. Language models (LM) are trained on curated data by framing curation as a multiclass or multilabel classification problem. The classification tasks are to predict multiple imaging scan sites, the presence of cancer, and cancer status from the reports. The trained natural language processing (NLP) classifiers achieve high weighted F1 scores and accuracy. We propose and demonstrate the use of these models to assist the manual curation process, which results in higher accuracy and F1 score at lower time and cost, thus improving cancer research efforts.

SIGNIFICANCE

Manually extracting structured radiology data for cancer research is laborious. With the assistance of AI-based NLP models, data elements are extracted faster and more accurately.
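The abstract reports weighted F1 score and accuracy for the classifiers but does not show how these metrics are computed. The sketch below is one minimal way to compute both for a multiclass task such as cancer status, using only the Python standard library. The label names and the sample predictions are hypothetical and chosen only for illustration; they are not taken from the paper, and this is not the authors' evaluation code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (multiclass)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    support = Counter(y_true)  # number of true examples per class
    total = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += f1 * n_true  # weight each class by its support
    return total / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted label matches the curated label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical cancer-status labels (assumed for illustration only).
y_true = ["progressing", "stable", "stable", "responding", "progressing", "stable"]
y_pred = ["progressing", "stable", "progressing", "responding", "progressing", "stable"]

print(f"weighted F1 = {weighted_f1(y_true, y_pred):.3f}")
print(f"accuracy    = {accuracy(y_true, y_pred):.3f}")
```

Weighting each class by its support keeps frequent classes from being underrepresented in the summary score, which matters when report categories are imbalanced.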

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63d7/11003452/d5de7d27fe88/crc-24-0064_fig1.jpg
