Comparative effectiveness of medical concept embedding for feature engineering in phenotyping.

Authors

Lee Junghwan, Liu Cong, Kim Jae Hyun, Butler Alex, Shang Ning, Pang Chao, Natarajan Karthik, Ryan Patrick, Ta Casey, Weng Chunhua

Affiliation

Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York 10032, USA.

Publication information

JAMIA Open. 2021 Jun 16;4(2):ooab028. doi: 10.1093/jamiaopen/ooab028. eCollection 2021 Apr.


DOI:10.1093/jamiaopen/ooab028
PMID:34142015
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8206403/
Abstract

OBJECTIVE: Feature engineering is a major bottleneck in phenotyping. Properly learned medical concept embeddings (MCEs) capture the semantics of medical concepts and are therefore useful for retrieving relevant medical features in phenotyping tasks. We compared the effectiveness of MCEs learned from knowledge graphs and from electronic health records (EHR) data in retrieving relevant medical features for phenotyping tasks.

MATERIALS AND METHODS: We implemented 5 embedding methods (node2vec, singular value decomposition (SVD), LINE, skip-gram, and GloVe) with 2 data sources: (1) knowledge graphs obtained from the Observational Medical Outcomes Partnership (OMOP) common data model; and (2) patient-level data obtained from the OMOP-compatible EHR from Columbia University Irving Medical Center (CUIMC). We used phenotypes with their relevant concepts, developed and validated by the Electronic Medical Records and Genomics (eMERGE) network, to evaluate the performance of learned MCEs in retrieving phenotype-relevant concepts. Retrieval performance for phenotype-relevant concepts, based on a single seed concept and on multiple seed concepts, was used to evaluate the MCEs.

RESULTS: Among all MCEs, those learned using node2vec with knowledge graphs showed the best performance. Within each data source, MCEs learned using node2vec performed best among the knowledge-graph-based MCEs, and MCEs learned using GloVe performed best among the EHR-based MCEs.

CONCLUSION: MCEs enable scalable feature engineering, thereby facilitating phenotyping. Based on current phenotyping practices, MCEs learned from knowledge graphs constructed from hierarchical relationships among medical concepts outperformed MCEs learned from EHR data.
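The seed-based retrieval described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy three-dimensional vectors, the concept names, the use of cosine similarity, the averaging of seed vectors for the multi-seed case, and the recall-style `hit_fraction` score, which may differ from the metric the authors used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(embeddings, seeds, k):
    """Return the k non-seed concepts closest to the mean of the seed vectors.

    Averaging the seeds is an illustrative choice for the multi-seed case.
    """
    dim = len(next(iter(embeddings.values())))
    query = [sum(embeddings[s][i] for s in seeds) / len(seeds) for i in range(dim)]
    candidates = [c for c in embeddings if c not in seeds]
    ranked = sorted(candidates, key=lambda c: cosine(query, embeddings[c]), reverse=True)
    return ranked[:k]


def hit_fraction(retrieved, relevant):
    """Fraction of the relevant concepts that appear in the retrieved list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved)) / len(relevant)


# Hypothetical toy embeddings; real MCEs would be learned from a knowledge
# graph or from EHR data and would have many more dimensions.
toy = {
    "type 2 diabetes": [0.9, 0.1, 0.0],
    "metformin": [0.8, 0.2, 0.1],
    "hemoglobin A1c": [0.85, 0.15, 0.05],
    "ankle fracture": [0.0, 0.1, 0.9],
    "asthma": [0.1, 0.9, 0.1],
}

# Single seed concept.
top_single = retrieve(toy, ["type 2 diabetes"], k=2)
score_single = hit_fraction(top_single, ["metformin", "hemoglobin A1c"])

# Multiple seed concepts.
top_multi = retrieve(toy, ["type 2 diabetes", "metformin"], k=1)
score_multi = hit_fraction(top_multi, ["hemoglobin A1c"])
```

In this toy setup, both the single-seed and the multi-seed query rank the diabetes-related concepts ahead of the unrelated ones. A real evaluation would compare the retrieved lists against the concept sets of validated phenotype definitions.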


Similar articles

[1]
Comparative effectiveness of medical concept embedding for feature engineering in phenotyping.

JAMIA Open. 2021-6-16

[2]
ARCH: Large-scale Knowledge Graph via Aggregated Narrative Codified Health Records Analysis.

medRxiv. 2023-5-21

[3]
HPO2Vec+: Leveraging heterogeneous knowledge resources to enrich node embeddings for the Human Phenotype Ontology.

J Biomed Inform. 2019-6-27

[4]
Feature extraction for phenotyping from semantic and knowledge resources.

J Biomed Inform. 2019-2-7

[5]
Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization.

J Biomed Inform. 2022-9

[6]
ARCH: Large-scale knowledge graph via aggregated narrative codified health records analysis.

J Biomed Inform. 2025-2

[7]
DOME: Directional medical embedding vectors from Electronic Health Records.

J Biomed Inform. 2025-2

[8]
Use of word and graph embedding to measure semantic relatedness between Unified Medical Language System concepts.

J Am Med Inform Assoc. 2020-10-1

[9]
Constructing High-Fidelity Phenotype Knowledge Graphs for Infectious Diseases With a Fine-Grained Semantic Information Model: Development and Usability Study.

J Med Internet Res. 2021-6-15

[10]
EHR phenotyping via jointly embedding medical concepts and words into a unified vector space.

BMC Med Inform Decis Mak. 2018-12-12

Cited by

[1]
Taking a look at your speech: identifying diagnostic status and negative symptoms of psychosis using convolutional neural networks.

NPP Digit Psychiatry Neurosci. 2025

[2]
Applying Natural Language Processing to Textual Data From Clinical Data Warehouses: Systematic Review.

JMIR Med Inform. 2023-12-15

[3]
SymptomGraph: Identifying Symptom Clusters from Narrative Clinical Notes using Graph Clustering.

Proc Symp Appl Comput. 2023-3

[4]
Phenotyping in distributed data networks: selecting the right codes for the right patients.

AMIA Annu Symp Proc. 2022

[5]
FAIRification of health-related data using semantic web technologies in the Swiss Personalized Health Network.

Sci Data. 2023-3-10

[6]
OARD: Open annotations for rare diseases and their phenotypes based on real-world data.

Am J Hum Genet. 2022-9-1

[7]
Phe2vec: Automated disease phenotyping based on unsupervised embeddings from electronic health records.

Patterns (N Y). 2021-9-2

[8]
Severity Prediction for COVID-19 Patients via Recurrent Neural Networks.

AMIA Jt Summits Transl Sci Proc. 2021

[9]
Severity Prediction for COVID-19 Patients via Recurrent Neural Networks.

medRxiv. 2021-1-21

References cited in this article

[1]
GRAM: Graph-based Attention Model for Healthcare Representation Learning.

KDD. 2017-8

[2]
High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP).

Nat Protoc. 2019-11-20

[3]
High-throughput multimodal automated phenotyping (MAP) with application to PheWAS.

J Am Med Inform Assoc. 2019-11-1

[4]
Graph embedding on biomedical networks: methods, applications and evaluations.

Bioinformatics. 2020-2-15

[5]
Making work visible for electronic phenotype implementation: Lessons learned from the eMERGE network.

J Biomed Inform. 2019-9-19

[6]
Detecting Systemic Data Quality Issues in Electronic Health Records.

Stud Health Technol Inform. 2019-8-21

[7]
Facilitating phenotype transfer using a common data model.

J Biomed Inform. 2019-7-17

[8]
HPO2Vec+: Leveraging heterogeneous knowledge resources to enrich node embeddings for the Human Phenotype Ontology.

J Biomed Inform. 2019-6-27

[9]
Advances in Electronic Phenotyping: From Rule-Based Definitions to Machine Learning Models.

Annu Rev Biomed Data Sci. 2018-7

[10]
Columbia Open Health Data, clinical concept prevalence and co-occurrence from electronic health records.

Sci Data. 2018-11-27
