Development and evaluation of novel ophthalmology domain-specific neural word embeddings to predict visual prognosis.

Affiliations

Byers Eye Institute, Department of Ophthalmology, Stanford University, 2370 Watson Court, Palo Alto, CA, 94303, United States.

Center for Biomedical Informatics Research, School of Medicine, Stanford University, 1265 Welch Road, Stanford, CA, 94305, United States.

Publication information

Int J Med Inform. 2021 Jun;150:104464. doi: 10.1016/j.ijmedinf.2021.104464. Epub 2021 Apr 16.

DOI:10.1016/j.ijmedinf.2021.104464

PMID:33892445

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8183292/

Abstract

OBJECTIVE

To develop and evaluate novel word embeddings (WEs) specific to ophthalmology, using text corpora from published literature and electronic health records (EHR).

MATERIALS AND METHODS

We trained ophthalmology-specific WEs on 121,740 PubMed abstracts and 89,282 EHR notes using the word2vec continuous bag-of-words architecture. The PubMed and EHR WEs were compared with general-domain GloVe WEs and general biomedical BioWordVec embeddings on two tasks: a novel 200-question ophthalmology-specific analogy test, and prediction of prognosis in 5547 low vision patients using EHR notes as inputs to a deep learning model.
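As an illustration of how an analogy test of this kind can be scored, the following is a minimal sketch that uses the common vector-offset method (often called 3CosAdd). The abstract does not say which scoring method the authors used, so the method itself is an assumption here. The three-dimensional vectors, the word list, and the function names are hypothetical toy values chosen for the example, and skipping questions that contain out-of-vocabulary words is likewise an assumption rather than a detail from the study.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def solve_analogy(emb, a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'.

    Uses the vector-offset method: the candidate closest (by cosine)
    to vec(b) - vec(a) + vec(c), excluding the three question words.
    """
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))


def analogy_accuracy(emb, questions):
    """Fraction of (a, b, c, d) questions answered correctly.

    Assumption: questions with any out-of-vocabulary word are skipped.
    """
    scored = [q for q in questions if all(w in emb for w in q)]
    if not scored:
        return 0.0
    correct = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in scored)
    return correct / len(scored)


# Hypothetical toy embeddings: axes 0-1 encode the tissue,
# axis 2 encodes "inflammation of that tissue".
toy_embeddings = {
    "cornea": [1.0, 0.0, 0.0],
    "keratitis": [1.0, 0.0, 1.0],
    "retina": [0.0, 1.0, 0.0],
    "retinitis": [0.0, 1.0, 1.0],
    "lens": [0.5, 0.5, 0.0],
}

toy_questions = [
    ("cornea", "keratitis", "retina", "retinitis"),
    ("retina", "retinitis", "cornea", "keratitis"),
]

accuracy = analogy_accuracy(toy_embeddings, toy_questions)
```

With real embeddings, `toy_embeddings` would be replaced by the trained word-to-vector mapping and `toy_questions` by the analogy question set; the reported score is then the accuracy expressed as a percentage.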

RESULTS

We found that many words representing important ophthalmic concepts in the EHR were missing from the general-domain GloVe vocabulary but were covered by the ophthalmology abstract corpus. On the ophthalmology analogy test, PubMed WEs scored 95.0%, outperforming EHR WEs (86.0%) and GloVe (91.0%) but falling short of BioWordVec (99.5%). In predicting low vision prognosis, PubMed and EHR WEs yielded similar AUROCs (0.830 and 0.826, respectively), outperforming GloVe (0.778) and BioWordVec (0.784).
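For readers less familiar with the AUROC metric reported above, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The labels and scores are hypothetical toy values, not data from the study, and the function name is illustrative.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = positive class).
    scores: predicted scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions: 3 positive and 3 negative patients.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

toy_auroc = auroc(toy_labels, toy_scores)  # 8 of 9 pairs ordered correctly
```

The quadratic pairwise loop is chosen for clarity; on cohorts of several thousand patients a rank-based formulation or a library routine would be the more practical choice.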

CONCLUSION

We found that using ophthalmology domain-specific WEs improved performance in ophthalmology-related clinical prediction compared to general WEs. Deep learning models using clinical notes as inputs can predict the prognosis of visually impaired patients. This work provides a framework to improve predictive models using domain-specific WEs.
