Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
Department of Biomedical Engineering, College of Engineering, University of Florida, Gainesville, FL, USA.
BMC Med Inform Decis Mak. 2022 Sep 27;22(Suppl 3):255. doi: 10.1186/s12911-022-01996-2.
Diabetic retinopathy (DR) is a leading cause of blindness in American adults. If detected early, DR can be treated to prevent further damage that leads to blindness. There is increasing interest in developing artificial intelligence (AI) technologies to help detect DR using electronic health records. The lesion-related information documented in fundus image reports is a valuable resource that could support the diagnosis of DR in clinical decision support systems. However, most studies of AI-based DR diagnosis are based on medical images; few studies have explored the lesion-related information captured in free-text image reports.
In this study, we examined two state-of-the-art transformer-based natural language processing (NLP) models, BERT and RoBERTa, and compared them with a recurrent neural network implemented using long short-term memory (LSTM) for extracting DR-related concepts from clinical narratives. We identified four categories of DR-related clinical concepts, including lesions, eye parts, laterality, and severity; developed annotation guidelines; annotated a DR corpus of 536 image reports; and developed transformer-based NLP models for clinical concept extraction and relation extraction. We also examined relation extraction under two settings: a 'gold-standard' setting, where gold-standard concepts were used, and an end-to-end setting.
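To make the concept-extraction setup concrete, the sketch below frames the task as BIO token classification over the four concept categories using the Hugging Face transformers API. This is a minimal illustration, not the authors' implementation: the checkpoint name, label set encoding, and example sentence are assumptions, and the paper's clinical-pretrained variants and hyperparameters are not reproduced here.

```python
# Minimal sketch: DR concept extraction as BIO token classification.
# Hypothetical checkpoint and label encoding; not the paper's released code.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Four DR-related concept categories from the study, encoded as BIO tags.
labels = ["O",
          "B-LESION", "I-LESION",
          "B-EYE_PART", "I-EYE_PART",
          "B-LATERALITY", "I-LATERALITY",
          "B-SEVERITY", "I-SEVERITY"]
id2label = dict(enumerate(labels))
label2id = {l: i for i, l in id2label.items()}

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels),
    id2label=id2label, label2id=label2id)

text = "Scattered microaneurysms and dot hemorrhages in the right macula."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, seq_len, num_labels)
pred_ids = logits.argmax(dim=-1)[0].tolist()

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, pid in zip(tokens, pred_ids):
    # Before fine-tuning on the annotated DR corpus, predictions are arbitrary.
    print(f"{tok}\t{id2label[pid]}")
```

Relation extraction between the predicted concepts (e.g., linking a lesion to its laterality) would be handled by a separate classification step; in the end-to-end setting it consumes these predicted spans rather than gold-standard ones.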
For concept extraction, the BERT model pretrained with the MIMIC III dataset achieved the best performance (F1-scores of 0.9503 and 0.9645 for strict and lenient evaluation, respectively). For relation extraction, the BERT model pretrained using general English text achieved the best strict/lenient F1-score of 0.9316. The end-to-end system, BERT_general_e2e, achieved the best strict/lenient F1-scores of 0.8578 and 0.8881, respectively. Another end-to-end system based on the RoBERTa architecture, RoBERTa_general_e2e, achieved the same strict F1-score as BERT_general_e2e.
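The strict and lenient F1-scores follow the usual clinical-NLP convention: a strict match requires exact span boundaries and the same concept type, while a lenient match accepts any character overlap with the same type. The sketch below illustrates that convention under this assumption; it is not the study's scoring script, and the example spans are hypothetical.

```python
# Sketch of strict vs. lenient span-level F1 (assumed standard definitions).
def f1(gold, pred, lenient=False):
    """gold/pred: lists of (start, end, type) character-offset spans."""
    def match(g, p):
        if g[2] != p[2]:
            return False
        if lenient:
            return g[0] < p[1] and p[0] < g[1]    # any overlap counts
        return g[0] == p[0] and g[1] == p[1]      # exact boundaries required
    tp_p = sum(any(match(g, p) for g in gold) for p in pred)
    precision = tp_p / len(pred) if pred else 0.0
    tp_r = sum(any(match(g, p) for p in pred) for g in gold)
    recall = tp_r / len(gold) if gold else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

gold = [(10, 25, "LESION"), (33, 38, "LATERALITY")]
pred = [(10, 25, "LESION"), (30, 38, "LATERALITY")]
print(f1(gold, pred))                 # 0.5  (strict: one boundary mismatch)
print(f1(gold, pred, lenient=True))   # 1.0  (lenient: overlap is enough)
```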
This study demonstrated the efficiency of transformer-based NLP models for clinical concept extraction and relation extraction. Our results show that it is necessary to pretrain transformer models using clinical text to optimize performance for clinical concept extraction, whereas for relation extraction, transformers pretrained using general English text performed better.