Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification.

Affiliations

Perelman School of Medicine at the University of Pennsylvania, 801 S 24th St #3, Philadelphia, PA, 19146, USA.

Hospital of the University of Pennsylvania, Philadelphia, PA, USA.

Publication information

J Digit Imaging. 2020 Feb;33(1):131-136. doi: 10.1007/s10278-019-00271-7.

Abstract

Radiologists regularly issue follow-up recommendations, yet our preliminary research has shown that 35 to 50% of patients who receive a follow-up recommendation for findings of possible cancer on abdominopelvic imaging do not return for follow-up. These patients remain at risk for adverse outcomes related to missed or delayed cancer diagnosis. In this study, we develop an algorithm that automatically detects free-text radiology reports containing a follow-up recommendation, using natural language processing (NLP) techniques and machine learning models. The data set consists of 6000 free-text reports from the authors' institution. NLP techniques are used to engineer 1500 features, which comprise the most informative unigrams, bigrams, and trigrams in the training corpus after tokenization and Porter stemming. On this data set, we train naive Bayes, decision tree, and maximum entropy models. The decision tree model, with an F1 score of 0.458 and an accuracy of 0.862, outperforms both the naive Bayes (F1 score of 0.381) and maximum entropy (F1 score of 0.387) models. We analyzed the models to identify predictive features; the term frequency of n-grams such as "renal neoplasm" and "evalu with enhanc" was most predictive of a follow-up recommendation. Key to maximizing performance were feature engineering that extracts predictive information and selection of machine learning algorithms appropriate to the feature set.
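As a rough illustration of two ideas in the abstract, the sketch below counts unigram, bigram, and trigram term frequencies for a single report and computes F1 and accuracy for binary predictions. It is a minimal standard-library sketch, not the authors' pipeline: the function names `tokenize`, `ngram_counts`, and `f1_and_accuracy` are hypothetical, the regular-expression tokenizer is a stand-in, Porter stemming is omitted (so "evaluate with enhanced" is not reduced to "evalu with enhanc"), and the labels in the usage example are made up rather than taken from the study.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a simple stand-in tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(tokens, max_n=3):
    """Term frequency of every n-gram with 1 <= n <= max_n in one report."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def f1_and_accuracy(y_true, y_pred):
    """F1 for the positive class (1 = follow-up recommended) and overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return f1, accuracy


# Illustrative report text (invented for this example).
report = "Renal neoplasm; evaluate with enhanced MRI."
features = ngram_counts(tokenize(report))

# Illustrative labels: 2 of 10 reports carry a follow-up recommendation.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
f1, accuracy = f1_and_accuracy(y_true, y_pred)
```

With these made-up labels the accuracy (0.8) is well above the F1 score (0.5), because correctly classified negative reports raise accuracy but do not enter the F1 calculation. When positive reports are a minority, a similar gap between the two metrics is to be expected.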
