Two complementary AI approaches for predicting UMLS semantic group assignment: heuristic reasoning and deep learning.

Author affiliation

National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.

Publication information

J Am Med Inform Assoc. 2023 Nov 17;30(12):1887-1894. doi: 10.1093/jamia/ocad152.

Abstract

OBJECTIVE

Use heuristic, deep learning (DL), and hybrid AI methods to predict semantic group (SG) assignments for new UMLS Metathesaurus atoms, with target accuracy ≥95%.

MATERIALS AND METHODS

We used train-test datasets from successive 2020AA-2022AB UMLS Metathesaurus releases. Our heuristic "waterfall" approach employed a sequence of 7 different SG prediction methods. Atoms not qualifying for a method were passed on to the next method. The DL approach generated BioWordVec and SapBERT embeddings for atom names, BioWordVec embeddings for source vocabulary names, and BioWordVec embeddings for atom names of the second-to-top nodes of an atom's source hierarchy. We fed a concatenation of the 4 embeddings into a fully connected multilayer neural network with an output layer of 15 nodes (one for each SG). For both approaches, we developed methods to estimate the probability that their predicted SG for an atom would be correct. Based on these estimations, we developed 2 hybrid SG prediction methods combining the strengths of heuristic and DL methods.
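The waterfall control flow described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the 7 methods, so the three rules here (`by_source_vocabulary`, `by_name_suffix`, `fallback`), the atom field names, and the vocabulary name `EXAMPLE_DRUG_VOCAB` are all hypothetical. Only the pattern is taken from the text: each method either returns an SG or declines, and a declined atom is passed on to the next method.

```python
from typing import Callable, Optional, Sequence, Tuple

# An atom is modelled as a plain dict; the field names are hypothetical.
Atom = dict
# A method returns an SG abbreviation, or None if the atom does not qualify.
Method = Callable[[Atom], Optional[str]]


def by_source_vocabulary(atom: Atom) -> Optional[str]:
    # Hypothetical rule: some source vocabularies cover a single SG.
    single_sg_sources = {"EXAMPLE_DRUG_VOCAB": "CHEM"}
    return single_sg_sources.get(atom.get("source"))


def by_name_suffix(atom: Atom) -> Optional[str]:
    # Hypothetical rule: a few name endings hint at an SG.
    name = atom.get("name", "").lower()
    if name.endswith("itis"):
        return "DISO"
    if name.endswith("ectomy"):
        return "PROC"
    return None


def fallback(atom: Atom) -> Optional[str]:
    # Hypothetical last step that always answers, so every atom gets an SG.
    return "DISO"


def waterfall_predict(
    atom: Atom, methods: Sequence[Method]
) -> Tuple[Optional[str], Optional[int]]:
    """Return (SG, 1-based index of the method that produced it)."""
    for index, method in enumerate(methods, start=1):
        sg = method(atom)
        if sg is not None:
            return sg, index
    # No method qualified (only possible without a catch-all last step).
    return None, None


METHODS = [by_source_vocabulary, by_name_suffix, fallback]
```

Calling `waterfall_predict({"name": "Appendicitis", "source": "OTHER"}, METHODS)` falls through the first rule and is answered by the second. Returning the method index alongside the SG is a design choice of this sketch: it makes it possible to measure accuracy per method, which is one plausible basis for estimating how likely a given prediction is to be correct.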

RESULTS

The heuristic waterfall approach accurately predicted 94.3% of SGs for 1 563 692 new unseen atoms. The DL accuracy on the same dataset was also 94.3%. The hybrid approaches achieved an average accuracy of 96.5%.
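The abstract does not specify how the 2 hybrid methods combine the two predictors, only that they rely on each approach's estimated probability of being correct. One plausible reading, shown purely as an assumption, is to keep the prediction whose estimated probability is higher. The `Prediction` type, the `hybrid_pick` function, and the tie-breaking rule are hypothetical names and choices made for this sketch.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    sg: str           # predicted semantic group abbreviation, e.g. "DISO"
    p_correct: float  # estimated probability that the prediction is correct
    origin: str       # "heuristic" or "dl"


def hybrid_pick(heuristic: Prediction, dl: Prediction) -> Prediction:
    """Assumed rule: trust whichever approach is more confident.

    When both approaches agree on the SG, there is nothing to arbitrate,
    so the heuristic prediction is returned unchanged. Ties in estimated
    probability also go to the heuristic prediction (an arbitrary choice).
    """
    if heuristic.sg == dl.sg:
        return heuristic
    return heuristic if heuristic.p_correct >= dl.p_correct else dl


# Example: the two approaches disagree and DL is more confident.
choice = hybrid_pick(
    Prediction("DISO", 0.90, "heuristic"),
    Prediction("PROC", 0.97, "dl"),
)
```

Under this rule a hybrid can only beat both components if the probability estimates are informative, that is, if each approach tends to report lower confidence on the atoms it gets wrong.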

CONCLUSION

Our study demonstrated that AI methods can predict SG assignments for new UMLS atoms with sufficient accuracy to be potentially useful as an intermediate step in the time-consuming task of assigning new atoms to UMLS concepts. We showed that for SG prediction, combining heuristic methods and DL methods can produce better results than either alone.
