Learning a Health Knowledge Graph from Electronic Medical Records.

Affiliations

Center for Data Science, New York University, New York, NY, USA.

Department of Computer Science, New York University, New York, NY, USA.

Publication Information

Sci Rep. 2017 Jul 20;7(1):5994. doi: 10.1038/s41598-017-05778-z.

Abstract

Demand for clinical decision support systems in medicine and self-diagnostic symptom checkers has substantially increased in recent years. Existing platforms rely on knowledge bases manually compiled through a labor-intensive process or automatically derived using simple pairwise statistics. This study explored an automated process to learn high-quality knowledge bases linking diseases and symptoms directly from electronic medical records. Medical concepts were extracted from 273,174 de-identified patient records, and maximum likelihood estimation of three probabilistic models was used to automatically construct knowledge graphs: logistic regression, a naive Bayes classifier, and a Bayesian network using noisy OR gates. A graph of disease-symptom relationships was elicited from the learned parameters, and the constructed knowledge graphs were evaluated and validated, with permission, against Google's manually constructed knowledge graph and against expert physician opinions. Our study shows that direct and automated construction of high-quality health knowledge graphs from medical records using rudimentary concept extraction is feasible. The noisy OR model produces a high-quality knowledge graph, reaching a precision of 0.85 for a recall of 0.6 in the clinical evaluation. Noisy OR significantly outperforms all other tested models across evaluation frameworks (p < 0.01).
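
The "noisy OR gates" mentioned above follow a standard parameterization; the sketch below uses generic notation (the leak term \ell_s and per-edge activation probabilities f_{ds} are labels chosen here, not symbols taken from the paper) and is included only to make the model concrete. With binary disease variables D_1, \ldots, D_n and a binary symptom S_s, each present disease is assumed to independently fail to trigger the symptom with some probability, and a leak term accounts for causes outside the model:

P(S_s = 1 \mid D_1, \ldots, D_n) = 1 - (1 - \ell_s) \prod_{d \,:\, D_d = 1} (1 - f_{ds})

Here f_{ds} is the probability that disease d alone activates symptom s. Under this reading, a disease-symptom edge in the elicited graph could be weighted by the learned f_{ds}, with larger values indicating stronger associations and low-weight edges pruned; the exact elicitation rule used in the paper may differ.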

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd26/5519723/145111d201f3/41598_2017_5778_Fig1_HTML.jpg
