Automatic Prediction of Recurrence of Major Cardiovascular Events: A Text Mining Study Using Chest X-Ray Reports.

Affiliations

Department of Methodology and Statistics, Faculty of Social Sciences, Utrecht University, Utrecht, Netherlands.

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, Netherlands.

Publication information

J Healthc Eng. 2021 Jul 9;2021:6663884. doi: 10.1155/2021/6663884. eCollection 2021.

DOI:10.1155/2021/6663884
PMID:34306597
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8285182/
Abstract

METHODS

We used electronic health record (EHR) data of patients included in the Second Manifestations of ARTerial disease (SMART) study. We propose a deep learning-based multimodal architecture for our text mining pipeline that integrates neural text representation with preprocessed clinical predictors to predict the recurrence of major cardiovascular events in cardiovascular patients. Text preprocessing, including cleaning and stemming, was first applied to filter unwanted text out of the X-ray radiology reports. Thereafter, text representation methods were used to represent the unstructured radiology reports numerically as vectors. Subsequently, these text representations were added to prediction models to assess their clinical relevance. In this step, we applied logistic regression, support vector machine (SVM), multilayer perceptron neural network, convolutional neural network, long short-term memory (LSTM), and bidirectional LSTM deep neural network (BiLSTM).
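
The preprocessing and representation steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the suffix list, the function names, the two sample reports, and the clinical values are all hypothetical and invented for illustration, and a simple bag-of-words count vector stands in for whichever representation a given model used. The sketch cleans and crudely stems each report, builds a vocabulary, and concatenates the text vector with clinical predictors (here only age and sex) to form one multimodal feature row per patient.

```python
import re
from collections import Counter

# Hypothetical suffix list for a crude stemmer; the stemmer used in the study
# is not described in the abstract.
SUFFIXES = ("ing", "ed", "es", "s")


def clean(text):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Clean a report and stem every remaining token."""
    return [stem(token) for token in clean(text)]


def build_vocabulary(reports):
    """Map every distinct stemmed token to a column index (sorted order)."""
    tokens = sorted({tok for report in reports for tok in preprocess(report)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(report, vocab):
    """Return a count vector for one report over the given vocabulary."""
    vector = [0] * len(vocab)
    for tok, count in Counter(preprocess(report)).items():
        if tok in vocab:
            vector[vocab[tok]] = count
    return vector


def multimodal_features(report, clinical, vocab):
    """Concatenate the text vector with the clinical predictors."""
    return bag_of_words(report, vocab) + list(clinical)


# Invented example data: two short reports and (age, sex) per patient.
reports = [
    "Enlarged heart, no pleural effusions.",
    "Normal heart size; lungs clear.",
]
clinical = [(67.0, 1.0), (54.0, 0.0)]

vocab = build_vocabulary(reports)
features = [multimodal_features(r, c, vocab) for r, c in zip(reports, clinical)]
```

Each row of `features` could then be passed to any of the classifiers listed above. Simple concatenation is only one way to combine the two inputs; the deep models in the study work from neural text representations rather than raw counts.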

RESULTS

We performed various experiments to evaluate the added value of the text in the prediction of major cardiovascular events. The two main scenarios were the integration of radiology reports (1) with classical clinical predictors and (2) with only age and sex in the case of unavailable clinical predictors. In total, data of 5603 patients were used with 5-fold cross-validation to train the models. In the first scenario, the multimodal BiLSTM (MI-BiLSTM) model achieved an area under the curve (AUC) of 84.7%, misclassification rate of 14.3%, and F1 score of 83.8%. In this scenario, the SVM model, trained on clinical variables and bag-of-words representation, achieved the lowest misclassification rate of 12.2%. In the case of unavailable clinical predictors, the MI-BiLSTM model trained on radiology reports and demographic (age and sex) variables reached an AUC, F1 score, and misclassification rate of 74.5%, 70.8%, and 20.4%, respectively.
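
For readers unfamiliar with the three reported metrics, the sketch below shows one common way to compute them for a binary outcome, together with a plain 5-fold index split. All names and the toy labels and scores are hypothetical; the abstract does not say how F1 was averaged or whether the folds were stratified or shuffled, so the binary F1 and contiguous folds used here are assumptions.

```python
def misclassification_rate(y_true, y_pred):
    """Fraction of predictions that differ from the true label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Toy labels and predicted probabilities, invented for illustration.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.4, 0.3, 0.1, 0.7, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

error = misclassification_rate(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
area = auc(y_true, scores)

# Fold sizes for a cohort of 5603 patients split five ways.
fold_sizes = [len(test) for _, test in k_fold_indices(5603, 5)]
```

Note that AUC is computed from the ranking of the scores, while misclassification rate and F1 depend on the chosen decision threshold (0.5 here). This is why one model can have the best AUC while another has the lowest misclassification rate, as in the first scenario above.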

CONCLUSIONS

Using the case study of routine care chest X-ray radiology reports, we demonstrated the clinical relevance of integrating text features and classical predictors in our text mining pipeline for cardiovascular risk prediction. The MI-BiLSTM model with word embedding representation appeared to have a desirable performance when trained on text data integrated with the clinical variables from the SMART study. Our results mined from chest X-ray reports showed that models using text data in addition to laboratory values outperform those using only known clinical predictors.
