Integrating Natural Language Processing and Machine Learning Algorithms to Categorize Oncologic Response in Radiology Reports.

Affiliations

Department of Radiology, Perelman School of Medicine, Hospital of the University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA, 19104, USA.

Musculoskeletal Imaging Division, Department of Radiology, Hospital of the University of Pennsylvania, 3400 Spruce St., 1 Silverstein, Philadelphia, PA, 19104, USA.

Publication information

J Digit Imaging. 2018 Apr;31(2):178-184. doi: 10.1007/s10278-017-0027-x.

DOI:10.1007/s10278-017-0027-x
PMID:29079959
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5873468/
Abstract

A significant volume of medical data remains unstructured. Natural language processing (NLP) and machine learning (ML) techniques have been shown to successfully extract insights from radiology reports. However, the codependent effects of NLP and ML in this context have not been well studied. Between April 1, 2015 and November 1, 2016, 9418 cross-sectional abdomen/pelvis CT and MR examinations containing our internal structured reporting element for cancer were separated into four categories: Progression, Stable Disease, Improvement, or No Cancer. We combined each of three NLP techniques with five ML algorithms to predict the assigned label using the unstructured report text and compared the performance of each combination. The three NLP algorithms included term frequency-inverse document frequency (TF-IDF), term frequency weighting (TF), and 16-bit feature hashing. The ML algorithms included logistic regression (LR), random decision forest (RDF), one-vs-all support vector machine (SVM), one-vs-all Bayes point machine (BPM), and fully connected neural network (NN). The best-performing NLP model consisted of tokenized unigrams and bigrams with TF-IDF. Increasing N-gram length yielded little to no added benefit for most ML algorithms. With all parameters optimized, SVM had the best performance on the test dataset, with an average accuracy of 90.6% and an F score of 0.813. The interplay between ML and NLP algorithms and their effect on interpretation accuracy is complex. The best accuracy is achieved when both algorithms are optimized concurrently.
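
The three text featurizations named in the abstract (TF, TF-IDF, and 16-bit feature hashing over unigrams and bigrams) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF variant, the use of CRC32 as the hash function, and the sample report sentences are all assumptions chosen for illustration.

```python
import math
import zlib
from collections import Counter


def ngrams(text, max_n=2):
    """Lowercase, split on whitespace, and emit every 1..max_n-gram."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tf(doc_terms):
    """Term frequency: raw count of each n-gram in one document."""
    return Counter(doc_terms)


def tfidf(corpus_terms):
    """TF-IDF using the smoothed IDF log((1 + N) / (1 + df)) + 1.

    This smoothing is one common variant (an assumption here);
    the abstract does not say which IDF formula was used.
    """
    n_docs = len(corpus_terms)
    df = Counter(term for terms in corpus_terms for term in set(terms))
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}
    return [
        {t: count * idf[t] for t, count in Counter(terms).items()}
        for terms in corpus_terms
    ]


def hashed(doc_terms, bits=16):
    """Feature hashing: count n-grams in one of 2**bits buckets.

    CRC32 is a stand-in hash (an assumption); distinct n-grams may
    collide in the same bucket, which is inherent to the technique.
    """
    size = 1 << bits
    vec = Counter()
    for term in doc_terms:
        vec[zlib.crc32(term.encode("utf-8")) % size] += 1
    return vec


# Hypothetical report snippets, invented for this example.
reports = [
    "interval increase in size of hepatic lesions",
    "hepatic lesions are stable in size",
    "no evidence of malignancy",
]
corpus = [ngrams(r) for r in reports]   # unigrams + bigrams per report
counts = tf(corpus[0])                  # TF features for the first report
weights = tfidf(corpus)                 # TF-IDF features for every report
buckets = hashed(corpus[0])             # 16-bit hashed features
```

In this sketch, an n-gram that appears in only one report (such as "interval") receives a larger TF-IDF weight than one shared across reports (such as "hepatic lesions"), while the hashed representation has a fixed width of 65,536 buckets regardless of vocabulary size. Any of these sparse vectors could then be passed to a classifier such as a one-vs-all SVM.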

