Information extraction from German radiological reports for general clinical text and language understanding.

Affiliations

Know-Center, 8010, Graz, Austria.

Division of Neuroradiology, Vascular and Interventional Radiology, Department of Radiology, Medical University Graz, 8036, Graz, Austria.

Publication information

Sci Rep. 2023 Feb 9;13(1):2353. doi: 10.1038/s41598-023-29323-3.

DOI: 10.1038/s41598-023-29323-3
PMID: 36759679
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9911592/

Abstract

Recent advances in deep learning and natural language processing (NLP) have opened many new opportunities for automatic text understanding and text processing in the medical field. This is of great benefit as many clinical downstream tasks rely on information from unstructured clinical documents. However, for low-resource languages like German, the use of modern text processing applications that require a large amount of training data proves to be difficult, as only few data sets are available mainly due to legal restrictions. In this study, we present an information extraction framework that was initially pre-trained on real-world computed tomographic (CT) reports of head examinations, followed by domain adaptive fine-tuning on reports from different imaging examinations. We show that in the pre-training phase, the semantic and contextual meaning of one clinical reporting domain can be captured and effectively transferred to foreign clinical imaging examinations. Moreover, we introduce an active learning approach with an intrinsic strategic sampling method to generate highly informative training data with low human annotation cost. We see that the model performance can be significantly improved by an appropriate selection of the data to be annotated, without the need to train the model on a specific downstream task. With a general annotation scheme that can be used not only in the radiology field but also in a broader clinical setting, we contribute to a more consistent labeling and annotation process that also facilitates the verification and evaluation of language models in the German clinical setting.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a7/9911592/6a85701abc73/41598_2023_29323_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a7/9911592/9dd8ed76b541/41598_2023_29323_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a7/9911592/991b481c4977/41598_2023_29323_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a7/9911592/8e70cf679ca5/41598_2023_29323_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a7/9911592/e3a22c1c3149/41598_2023_29323_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a7/9911592/f7c5d0fcff9d/41598_2023_29323_Fig5_HTML.jpg
