Unsupervised Event Graph Representation and Similarity Learning on Biomedical Literature.

Affiliations

Department of Computer Science and Engineering (DISI), University of Bologna, 40126 Bologna, Italy.

Independent Researcher, 48018 Faenza, Italy.

Publication information

Sensors (Basel). 2021 Dec 21;22(1):3. doi: 10.3390/s22010003.

Abstract

The automatic extraction of biomedical events from the scientific literature has drawn keen interest in the last several years, recognizing complex and semantically rich graphical interactions otherwise buried in texts. However, very few works revolve around learning embeddings or similarity metrics for event graphs. This gap leaves biological relations unlinked and prevents the application of machine learning techniques to promote discoveries. Taking advantage of recent deep graph kernel solutions and pre-trained language models, we propose Deep Divergence Event Graph Kernels (DDEGK), an unsupervised inductive method to map events into low-dimensional vectors, preserving their structural and semantic similarities. Unlike most other systems, DDEGK operates at a graph level and does not require task-specific labels, feature engineering, or known correspondences between nodes. To this end, our solution compares events against a small set of anchor ones, trains cross-graph attention networks for drawing pairwise alignments (bolstering interpretability), and employs transformer-based models to encode continuous attributes. Extensive experiments have been done on nine biomedical datasets. We show that our learned event representations can be effectively employed in tasks such as graph classification, clustering, and visualization, also facilitating downstream semantic textual similarity. Empirical results demonstrate that DDEGK significantly outperforms other state-of-the-art methods.
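The anchor-based idea in the abstract (an event graph is represented by how well each of a small set of anchor graphs explains it through a cross-graph node alignment) can be illustrated with a minimal sketch. This is not the authors' implementation: DDEGK trains its attention networks and encodes node attributes with transformer-based models, whereas the sketch below assumes fixed dot-product attention, toy one-hot node features, and a squared-error reconstruction score. The names `divergence` and `embed` and the two toy graphs are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def divergence(target_adj, target_feat, anchor_adj, anchor_feat):
    # Soft alignment: each target node attends over the anchor's nodes.
    # Assumption: untrained dot-product attention stands in for the
    # learned cross-graph attention networks described in the abstract.
    att = softmax(target_feat @ anchor_feat.T, axis=1)   # (n_target, n_anchor)
    # Carry the anchor's structure over to the target's nodes.
    recon = att @ anchor_adj @ att.T                      # (n_target, n_target)
    # Assumption: mean squared error as a simple stand-in divergence score.
    return float(np.mean((recon - target_adj) ** 2))

def embed(graph, anchors):
    # The graph embedding is the vector of divergences against each anchor.
    adj, feat = graph
    return np.array([divergence(adj, feat, a_adj, a_feat)
                     for a_adj, a_feat in anchors])

# Toy graphs (hypothetical): a 3-node path and a 3-node triangle.
# Scaled one-hot features make the attention close to a hard alignment.
feat = 10.0 * np.eye(3)
path = (np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]]), feat)
triangle = (np.ones((3, 3)) - np.eye(3), feat)

anchors = [path, triangle]
emb = embed(path, anchors)
print(emb)  # one divergence score per anchor
```

In this toy setting the path graph scores close to zero against the path anchor and higher against the triangle anchor, so structurally similar graphs end up with similar embedding vectors. The attention matrix `att` is also the pairwise node alignment that the abstract credits with supporting interpretability.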

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/49ab/8747118/2bed3b805b9c/sensors-22-00003-g006.jpg
