Embedding of semantic predications.

Author information

Cohen Trevor, Widdows Dominic

Affiliations

School of Biomedical Informatics, The University of Texas Health Science Center, Houston, TX, United States.

Grab, Inc., Seattle, WA, United States.

Publication information

J Biomed Inform. 2017 Apr;68:150-166. doi: 10.1016/j.jbi.2017.03.003. Epub 2017 Mar 8.

DOI:10.1016/j.jbi.2017.03.003

PMID:28284761

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5441848/

Abstract

This paper concerns the generation of distributed vector representations of biomedical concepts from structured knowledge, in the form of subject-relation-object triplets known as semantic predications. Specifically, we evaluate the extent to which a representational approach we have developed for this purpose previously, known as Predication-based Semantic Indexing (PSI), might benefit from insights gleaned from neural-probabilistic language models, which have enjoyed a surge in popularity in recent years as a means to generate distributed vector representations of terms from free text. To do so, we develop a novel neural-probabilistic approach to encoding predications, called Embedding of Semantic Predications (ESP), by adapting aspects of the Skipgram with Negative Sampling (SGNS) algorithm to this purpose. We compare ESP and PSI across a number of tasks including recovery of encoded information, estimation of semantic similarity and relatedness, and identification of potentially therapeutic and harmful relationships using both analogical retrieval and supervised learning. We find advantages for ESP in some, but not all of these tasks, revealing the contexts in which the additional computational work of neural-probabilistic modeling is justified.

Similar articles

Embedding of semantic predications.

J Biomed Inform. 2017 Apr;68:150-166. doi: 10.1016/j.jbi.2017.03.003. Epub 2017 Mar 8.

Predication-based semantic indexing: permutations as a means to encode predications in semantic space.

AMIA Annu Symp Proc. 2009 Nov 14;2009:114-8.

COS: A new MeSH term embedding incorporating corpus, ontology, and semantic predications.

PLoS One. 2021 May 4;16(5):e0251094. doi: 10.1371/journal.pone.0251094. eCollection 2021.

Predicting Adverse Drug-Drug Interactions with Neural Embedding of Semantic Predications.

AMIA Annu Symp Proc. 2020 Mar 4;2019:992-1001. eCollection 2019.

A graph-based recovery and decomposition of Swanson's hypothesis using semantic predications.

J Biomed Inform. 2013 Apr;46(2):238-51. doi: 10.1016/j.jbi.2012.09.004. Epub 2012 Sep 28.

Context-driven automatic subgraph creation for literature-based discovery.

J Biomed Inform. 2015 Apr;54:141-57. doi: 10.1016/j.jbi.2015.01.014. Epub 2015 Feb 7.

Use of word and graph embedding to measure semantic relatedness between Unified Medical Language System concepts.

J Am Med Inform Assoc. 2020 Oct 1;27(10):1538-1546. doi: 10.1093/jamia/ocaa136.

Neural sentence embedding models for semantic similarity estimation in the biomedical domain.

BMC Bioinformatics. 2019 Apr 11;20(1):178. doi: 10.1186/s12859-019-2789-2.

Evaluating semantic relations in neural word embeddings with biomedical and general domain knowledge bases.

BMC Med Inform Decis Mak. 2018 Jul 23;18(Suppl 2):65. doi: 10.1186/s12911-018-0630-x.

Learning predictive models of drug side-effect relationships from distributed representations of literature-derived semantic predications.

J Am Med Inform Assoc. 2018 Oct 1;25(10):1339-1350. doi: 10.1093/jamia/ocy077.

Cited by

Hyperdimensional computing: A fast, robust, and interpretable paradigm for biological data.

PLoS Comput Biol. 2024 Sep 24;20(9):e1012426. doi: 10.1371/journal.pcbi.1012426. eCollection 2024 Sep.

Causal feature selection using a knowledge graph combining structured knowledge from the biomedical literature and ontologies: A use case studying depression as a risk factor for Alzheimer's disease.

J Biomed Inform. 2023 Jun;142:104368. doi: 10.1016/j.jbi.2023.104368. Epub 2023 Apr 21.

Patient Representation Learning From Heterogeneous Data Sources and Knowledge Graphs Using Deep Collective Matrix Factorization: Evaluation Study.

JMIR Med Inform. 2022 Jan 20;10(1):e28842. doi: 10.2196/28842.

Adverse Drug Event Prediction Using Noisy Literature-Derived Knowledge Graphs: Algorithm Development and Validation.

JMIR Med Inform. 2021 Oct 25;9(10):e32730. doi: 10.2196/32730.

Using computable knowledge mined from the literature to elucidate confounders for EHR-based pharmacovigilance.

J Biomed Inform. 2021 May;117:103719. doi: 10.1016/j.jbi.2021.103719. Epub 2021 Mar 11.

Drug repurposing for COVID-19 via knowledge graph completion.

J Biomed Inform. 2021 Mar;115:103696. doi: 10.1016/j.jbi.2021.103696. Epub 2021 Feb 8.

Exploring Novel Computable Knowledge in Structured Drug Product Labels.

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:403-412. eCollection 2020.

Broad-coverage biomedical relation extraction with SemRep.

BMC Bioinformatics. 2020 May 14;21(1):188. doi: 10.1186/s12859-020-3517-7.

Predicting Adverse Drug-Drug Interactions with Neural Embedding of Semantic Predications.

AMIA Annu Symp Proc. 2020 Mar 4;2019:992-1001. eCollection 2019.

Complementing Observational Signals with Literature-Derived Distributed Representations for Post-Marketing Drug Surveillance.

Drug Saf. 2020 Jan;43(1):67-77. doi: 10.1007/s40264-019-00872-9.

References

Classification-by-Analogy: Using Vector Representations of Implicit Relationships to Identify Plausibly Causal Drug/Side-effect Relationships.

AMIA Annu Symp Proc. 2017 Feb 10;2016:1940-1949. eCollection 2016.

Corpus domain effects on distributional semantic modeling of medical terms.

Bioinformatics. 2016 Dec 1;32(23):3635-3644. doi: 10.1093/bioinformatics/btw529. Epub 2016 Aug 16.

Reasoning with Vectors: A Continuous Model for Fast Robust Inference.

Log J IGPL. 2015 Oct;23(2):141-173. doi: 10.1093/jigpal/jzu028. Epub 2014 Nov 19.

Predicting high-throughput screening results with scalable literature-based discovery methods.

CPT Pharmacometrics Syst Pharmacol. 2014 Oct 8;3(10):e140. doi: 10.1038/psp.2014.37.

Identifying plausible adverse drug reactions using knowledge extracted from the literature.

J Biomed Inform. 2014 Dec;52:293-310. doi: 10.1016/j.jbi.2014.07.011. Epub 2014 Jul 19.

Defining a reference set to support methodological research in drug safety.

Drug Saf. 2013 Oct;36 Suppl 1:S33-47. doi: 10.1007/s40264-013-0097-8.

Representation learning: a review and new perspectives.

IEEE Trans Pattern Anal Mach Intell. 2013 Aug;35(8):1798-828. doi: 10.1109/TPAMI.2013.50.

Representing objects, relations, and sequences.

Neural Comput. 2013 Aug;25(8):2038-78. doi: 10.1162/NECO_a_00467. Epub 2013 Apr 22.

SemMedDB: a PubMed-scale repository of biomedical semantic predications.

Bioinformatics. 2012 Dec 1;28(23):3158-60. doi: 10.1093/bioinformatics/bts591. Epub 2012 Oct 8.

Discovering discovery patterns with Predication-based Semantic Indexing.

J Biomed Inform. 2012 Dec;45(6):1049-65. doi: 10.1016/j.jbi.2012.07.003. Epub 2012 Jul 26.
