Extracting similar terms from multiple EMR-based semantic embeddings to support chart reviews.

Author information

Department of Computer Science, Vanderbilt University, Nashville, TN, USA.

Department of Computer Science, Vanderbilt University, Nashville, TN, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.

Publication information

J Biomed Inform. 2018 Jul;83:63-72. doi: 10.1016/j.jbi.2018.05.014. Epub 2018 May 22.

Abstract

OBJECTIVE

Word embeddings project semantically similar terms into nearby points in a vector space. When trained on clinical text, these embeddings can be leveraged to improve keyword search and text highlighting. In this paper, we present methods to refine the selection process of similar terms from multiple EMR-based word embeddings, and evaluate their performance quantitatively and qualitatively across multiple chart review tasks.
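As an illustration of the "nearby points" idea, the following minimal sketch ranks terms by cosine similarity to a query term. The three-dimensional vectors, the term list, and the helper names `cosine` and `most_similar` are invented for this example; they are not taken from the paper, and real embeddings would be learned from clinical notes and have far more dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query, vectors, top_n=3):
    """Return the top_n (term, similarity) pairs closest to the query term."""
    q = vectors[query]
    scored = [(term, cosine(q, vec))
              for term, vec in vectors.items() if term != query]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Toy vectors, invented for illustration only (hypothetical values).
toy_vectors = {
    "mi":         [0.90, 0.10, 0.00],
    "infarction": [0.80, 0.20, 0.10],
    "stemi":      [0.85, 0.05, 0.05],
    "fracture":   [0.00, 0.90, 0.30],
}

neighbors = most_similar("mi", toy_vectors, top_n=2)
for term, sim in neighbors:
    print(f"{term}: {sim:.3f}")
```

In a keyword-search setting, the returned neighbors would be candidate expansions of the query term; how many to keep and from which embedding is the selection problem the paper addresses.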

MATERIALS AND METHODS

Word embeddings were trained on each clinical note type in an EMR. These embeddings were then combined, weighted, and truncated to select a refined set of similar terms to be used in keyword search and text highlighting. To evaluate their quality, we measured the similar terms' information retrieval (IR) performance using precision-at-K (P@5, P@10). Additionally, a user study evaluated users' search term preferences, while a timing study measured the time to answer a question from a clinical chart.
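The abstract does not specify how the embeddings are combined, weighted, or truncated, so the sketch below is only one plausible reading: it sums weighted similarity scores across note types and drops terms below a threshold. The function names, note-type labels, weights, and cutoff are all assumptions made for illustration. The `precision_at_k` helper follows the standard definition of P@K (the fraction of the top K retrieved items that are relevant).

```python
from collections import defaultdict

def combine_similar_terms(neighbors_by_note_type, weights,
                          min_score=0.3, max_terms=10):
    """Merge per-note-type neighbor lists into one refined term list.

    Assumed scheme (not from the paper): weighted sum of similarities,
    then truncation by a score threshold and a maximum list length.
    """
    combined = defaultdict(float)
    for note_type, neighbors in neighbors_by_note_type.items():
        w = weights.get(note_type, 0.0)  # unknown note types contribute nothing
        for term, sim in neighbors:
            combined[term] += w * sim
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(combined.items(), key=lambda p: (-p[1], p[0]))
    return [term for term, score in ranked if score >= min_score][:max_terms]

def precision_at_k(ranked_items, relevant, k):
    """Fraction of the top-k ranked items that appear in the relevant set."""
    top_k = ranked_items[:k]
    return sum(1 for item in top_k if item in relevant) / k

# Hypothetical neighbor lists for the query "mi" from two embeddings.
neighbors = {
    "discharge_summary": [("stemi", 0.9), ("infarction", 0.8), ("cabg", 0.4)],
    "progress_note":     [("stemi", 0.7), ("nstemi", 0.8), ("troponin", 0.5)],
}
weights = {"discharge_summary": 0.6, "progress_note": 0.4}  # assumed weights

refined = combine_similar_terms(neighbors, weights)
print(refined)

# Hypothetical search results and relevance judgments for a P@5 check.
retrieved = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d4"}
print(precision_at_k(retrieved, relevant, k=5))
```

Terms supported by several note types ("stemi" here) accumulate a higher combined score, while terms that appear weakly in a single embedding fall below the cutoff and are truncated.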

RESULTS

The refined terms outperformed the baseline method in information retrieval performance (e.g., increasing the average P@5 from 0.48 to 0.60). Additionally, the refined terms were preferred by most users and reduced the average time to answer a question.

CONCLUSIONS

Clinical information can be more quickly retrieved and synthesized when using semantically similar terms from multiple embeddings.
