Natural language processing and machine learning algorithm to identify brain MRI reports with acute ischemic stroke.

Affiliations

Department of Neurology, Hallym University College of Medicine, Chuncheon, Korea.

Medical University of South Carolina, Charleston, South Carolina, United States of America.

Publication information

PLoS One. 2019 Feb 28;14(2):e0212778. doi: 10.1371/journal.pone.0212778. eCollection 2019.

DOI: 10.1371/journal.pone.0212778
PMID: 30818342
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6394972/
Abstract

BACKGROUND AND PURPOSE

This project assessed the performance of natural language processing (NLP) and machine learning (ML) algorithms for classifying brain MRI radiology reports into acute ischemic stroke (AIS) and non-AIS phenotypes.

MATERIALS AND METHODS

All brain MRI reports from a single academic institution over a two-year period were randomly divided into two groups for ML: training (70%) and testing (30%). Using the "quanteda" NLP package, all text data were parsed into tokens to create the data frequency matrix. Ten-fold cross-validation was applied for bias correction of the training set. Labeling for AIS was performed manually by reviewing clinical notes. We applied binary logistic regression, naïve Bayesian classification, single decision tree, and support vector machine as the binary classifiers, and we assessed the performance of the algorithms by F1-measure. We also assessed how n-grams and term frequency-inverse document frequency (TF-IDF) weighting affected the performance of the algorithms.
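The preprocessing steps described above (tokenization, a document-by-term count matrix, optional bigrams, optional TF-IDF weighting, and a 70/30 split) can be sketched roughly as follows. This is a minimal standard-library Python illustration, not the authors' pipeline, which used the R package "quanteda". The function names, the `log10(N / df)` weighting form, the fixed seed, and the two sample report snippets are all assumptions made for illustration.

```python
import math
import random
import re
from collections import Counter


def tokenize(text, max_n=1):
    """Lowercase the text, keep alphabetic words, and append n-grams up to max_n."""
    words = re.findall(r"[a-z]+", text.lower())
    tokens = list(words)
    for n in range(2, max_n + 1):
        tokens += ["_".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return tokens


def document_term_matrix(docs, max_n=1):
    """Return (vocabulary, count matrix) with one row per document."""
    counts = [Counter(tokenize(doc, max_n)) for doc in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[term] for term in vocab] for c in counts]


def tfidf(matrix):
    """Weight raw counts by log10(N / document frequency), one common TF-IDF form."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    doc_freq = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    return [[row[j] * math.log10(n_docs / doc_freq[j]) for j in range(n_terms)]
            for row in matrix]


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of items and split it into training and testing parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical report snippets, invented for illustration only.
reports = ["Acute infarct in the left MCA territory.", "No acute infarct."]
vocab, counts = document_term_matrix(reports, max_n=2)  # unigrams plus bigrams
weighted = tfidf(counts)
```

Note that a term appearing in every document receives a weight of zero under this TF-IDF form, which is one plausible reason such weighting may add little when the discriminating vocabulary is small.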

RESULTS

Of all 3,204 brain MRI documents, 432 (14.3%) were labeled as AIS. AIS documents were longer in character length than non-AIS documents (median [interquartile range]: 551 [377-681] vs. 309 [164-396]). Of all ML algorithms, the single decision tree had the highest F1-measure (93.2) and accuracy (98.0%). Adding bigrams to the ML model improved the F1-measure of naïve Bayesian classification but not of the other classifiers, and applying term frequency-inverse document frequency weighting to the data frequency matrix did not yield any additional performance improvement.
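For readers unfamiliar with the two reported metrics, the sketch below shows how F1-measure and accuracy are commonly computed from binary labels, and why F1 is informative when the positive class is a minority. It is a generic illustration, not the authors' evaluation code. The function name and the toy label vectors are hypothetical, and the values are fractions in [0, 1], whereas the abstract reports F1 on a 0-100 scale.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels, invented for illustration: 2 positives among 10 documents.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10  # a classifier that never predicts the positive class
print(binary_metrics(y_true, always_negative))
```

In this toy case the always-negative classifier reaches an accuracy of 0.8 while its F1 is 0.0, which is why F1 is a stricter yardstick than accuracy on imbalanced data.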

CONCLUSIONS

Supervised ML-based NLP algorithms are useful for automatically classifying brain MRI reports to identify AIS patients. The single decision tree was the best classifier for identifying brain MRI reports with AIS.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/992c/6394972/df2f926af387/pone.0212778.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/992c/6394972/f0d02aa38ea9/pone.0212778.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/992c/6394972/9ae0debf306c/pone.0212778.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/992c/6394972/326dac493524/pone.0212778.g004.jpg
