BERT-based natural language processing analysis of French CT reports: Application to the measurement of the positivity rate for pulmonary embolism.

Author information

Jupin-Delevaux Émilien, Djahnine Aissam, Talbot François, Richard Antoine, Gouttard Sylvain, Mansuy Adeline, Douek Philippe, Si-Mohamed Salim, Boussel Loïc

Affiliations

Radiology department, Hospices Civils de Lyon - HCL, Lyon, France.

CREATIS, Univ Lyon, INSA-Lyon, Université Claude Bernard Lyon 1, UJM-Saint Etienne, CNRS, Inserm, CREATIS UMR 5220, U1294, Lyon, France.

Publication information

Res Diagn Interv Imaging. 2023 Mar 27;6:100027. doi: 10.1016/j.redii.2023.100027. eCollection 2023 Jun.

DOI:10.1016/j.redii.2023.100027
PMID:39077547
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11265488/
Abstract

RATIONALE AND OBJECTIVES

To develop a natural language processing (NLP) method based on Bidirectional Encoder Representations from Transformers (BERT), adapted to French CT reports, and to evaluate its performance in calculating the diagnostic yield of CT in patients with clinically suspected pulmonary embolism (PE).

MATERIALS AND METHODS

All reports of CT examinations performed at our institution in 2019 (99,510 reports; training and validation dataset) and 2018 (94,559 reports; testing dataset) were included after anonymization. Two BERT-based NLP sentence classifiers were trained on 27,700 manually labeled sentences from the training dataset. The first classified report sentences into three classes ("Non chest", "Healthy chest", and "Pathological chest"); the second classified sentences of the last class into eleven pathology subclasses, including "pulmonary embolism". The F1-score was reported on the validation dataset. The classifiers were then applied to the testing-dataset reports of CT examinations requested for suspected pulmonary embolism. Sensitivity, specificity, and accuracy for detecting the presence of pulmonary embolism were reported against human analysis of the reports.
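
The two-stage sentence classification described above can be sketched as a simple cascade. This is a minimal illustration, not the authors' implementation: the keyword-based `toy_stage1` and `toy_stage2` functions are hypothetical stand-ins for the two fine-tuned BERT classifiers, and the rule that a report counts as positive when any sentence reaches the "pulmonary embolism" subclass is an assumption, since the abstract does not state how sentence labels are aggregated per report.

```python
from typing import Callable, Iterable

# Stage-1 labels are taken from the abstract. Of the eleven stage-2
# subclasses, the abstract names only "pulmonary embolism".
PATHOLOGICAL = "Pathological chest"
PE_LABEL = "pulmonary embolism"


def report_is_pe_positive(
    sentences: Iterable[str],
    stage1: Callable[[str], str],
    stage2: Callable[[str], str],
) -> bool:
    """Return True if any sentence is routed to the PE subclass.

    Assumption: report-level positivity = at least one PE sentence.
    """
    for sentence in sentences:
        # Stage 1: only pathological chest sentences go on to stage 2.
        if stage1(sentence) != PATHOLOGICAL:
            continue
        # Stage 2: assign the pathology subclass.
        if stage2(sentence) == PE_LABEL:
            return True
    return False


# Hypothetical keyword stand-ins for the two BERT classifiers.
def toy_stage1(sentence: str) -> str:
    text = sentence.lower()
    chest_terms = ("pulmonaire", "thorax", "pleural")
    if not any(term in text for term in chest_terms):
        return "Non chest"
    if text.startswith("pas d"):  # crude negation cue, illustration only
        return "Healthy chest"
    return PATHOLOGICAL


def toy_stage2(sentence: str) -> str:
    return PE_LABEL if "embolie" in sentence.lower() else "other"


positive_report = [
    "Foie de taille normale.",
    "Embolie pulmonaire lobaire inférieure droite.",
]
negative_report = ["Pas d'embolie pulmonaire.", "Foie de taille normale."]

print(report_is_pe_positive(positive_report, toy_stage1, toy_stage2))
print(report_is_pe_positive(negative_report, toy_stage1, toy_stage2))
```

In a real pipeline the two callables would wrap the trained models; keeping them as plain functions makes the cascade logic testable independently of the classifiers.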

RESULTS

The F1-scores of the 3-class and 11-subclass classifiers were 0.984 and 0.985, respectively. In the testing dataset, 4,042 examinations were requested for pulmonary embolism, of which 641 (15.8%) were rated positive by radiologists. The sensitivity, specificity, and accuracy of the NLP network for identifying pulmonary embolism in these reports were 98.2%, 99.3%, and 99.1%, respectively.
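
For readers unfamiliar with these measures, the sketch below shows how sensitivity, specificity, accuracy, and a positivity rate are computed from counts. The confusion-matrix counts in the example are invented for illustration and are not the study's data, which the abstract does not report; only the 641 and 4,042 figures come from the text above. The function names are hypothetical.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard definitions, with human report analysis as the reference."""
    return {
        # Share of truly positive reports that the classifier flags.
        "sensitivity": tp / (tp + fn),
        # Share of truly negative reports that the classifier clears.
        "specificity": tn / (tn + fp),
        # Share of all reports classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def positivity_rate(positives: int, total: int) -> float:
    """Diagnostic yield: fraction of requested examinations that are positive."""
    return positives / total


# Made-up counts, NOT the study's confusion matrix.
example = binary_metrics(tp=90, fp=20, tn=880, fn=10)
print(example)

# Counts stated in the abstract: 641 positive out of 4,042 examinations.
print(f"{positivity_rate(641, 4042):.4f}")
```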

CONCLUSION

A BERT-based NLP sentence classifier enables the analysis of large databases of radiological reports to accurately determine the diagnostic yield of CT screening.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc63/11265488/e6e5044d1502/gr1.jpg
