Learning a Health Knowledge Graph from Electronic Medical Records.

Affiliations

Center for Data Science, New York University, New York, NY, USA.

Department of Computer Science, New York University, New York, NY, USA.

Publication information

Sci Rep. 2017 Jul 20;7(1):5994. doi: 10.1038/s41598-017-05778-z.

DOI:10.1038/s41598-017-05778-z

PMID:28729710

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5519723/

Abstract

Demand for clinical decision support systems in medicine and self-diagnostic symptom checkers has substantially increased in recent years. Existing platforms rely on knowledge bases manually compiled through a labor-intensive process or automatically derived using simple pairwise statistics. This study explored an automated process to learn high quality knowledge bases linking diseases and symptoms directly from electronic medical records. Medical concepts were extracted from 273,174 de-identified patient records and maximum likelihood estimation of three probabilistic models was used to automatically construct knowledge graphs: logistic regression, naive Bayes classifier and a Bayesian network using noisy OR gates. A graph of disease-symptom relationships was elicited from the learned parameters and the constructed knowledge graphs were evaluated and validated, with permission, against Google's manually-constructed knowledge graph and against expert physician opinions. Our study shows that direct and automated construction of high quality health knowledge graphs from medical records using rudimentary concept extraction is feasible. The noisy OR model produces a high quality knowledge graph reaching precision of 0.85 for a recall of 0.6 in the clinical evaluation. Noisy OR significantly outperforms all tested models across evaluation frameworks (p < 0.01).
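
The noisy OR gate named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook parameterization: each active disease independently fails to produce the symptom with some probability, and a leak term covers causes outside the model. The function name, disease names, and numeric values below are hypothetical and are not taken from the paper.

```python
from math import prod


def noisy_or_prob(active_failure_probs, leak=0.0):
    """P(symptom present | active diseases) under a noisy OR gate.

    active_failure_probs: for each disease that is present, the probability
        that it alone fails to turn the symptom on.
    leak: probability that the symptom appears with no modeled disease present.
    """
    # The symptom is absent only if the leak and every active disease all fail.
    p_absent = (1.0 - leak) * prod(active_failure_probs)
    return 1.0 - p_absent


# Hypothetical failure probabilities for two co-occurring diseases.
failure = {"influenza": 0.3, "pneumonia": 0.4}
p_symptom = noisy_or_prob(failure.values(), leak=0.01)
print(f"P(symptom | influenza, pneumonia) = {p_symptom:.4f}")
```

In a sketch like this, a disease-symptom edge would be considered strong when its failure probability is low, which is one plausible way a graph could be read off from learned parameters.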

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd26/5519723/145111d201f3/41598_2017_5778_Fig1_HTML.jpg
