Prediction of biomarker-disease associations based on graph attention network and text representation.

Affiliations

College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China.

Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China.

Publication information

Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac298.

Abstract

MOTIVATION

The associations between biomarkers and human diseases play a key role in understanding complex pathology and developing targeted therapies. Wet lab experiments for biomarker discovery are costly, laborious and time-consuming. Computational prediction methods can be used to greatly expedite the identification of candidate biomarkers.

RESULTS

Here, we present GTGenie, a novel computational model for predicting biomarker-disease associations from graph and text features. In GTGenie, a graph attention network characterizes diverse similarities of biomarkers and diseases drawn from heterogeneous information resources. Meanwhile, a pretrained BERT-based model learns text-based representations of biomarker-disease relations from the biomedical literature. The captured graph and text features are then integrated in a bimodal fusion network to model a hybrid entity representation. Finally, inductive matrix completion is adopted to infer the missing entries and reconstruct the relation matrix, from which unknown biomarker-disease associations are predicted. Experimental results on the HMDD, HMDAD and LncRNADisease data sets show that GTGenie achieves prediction performance competitive with other state-of-the-art methods.
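As a rough illustration of the final step, inductive matrix completion can be sketched as learning two low-rank projections so that the relation matrix is approximated by a bilinear form over entity features. The sketch below is not the GTGenie implementation: the factorization A ≈ (XW)(YH)ᵀ, the squared-error loss, plain gradient descent, and the names `imc_fit` and `imc_scores` are all assumptions made for illustration, and the feature matrices `X` and `Y` stand in for whatever representations the fusion network would produce.

```python
import numpy as np

def imc_fit(A, X, Y, rank=4, lr=0.05, reg=0.01, epochs=1000, seed=0):
    """Fit A ~ (X @ W) @ (Y @ H).T by gradient descent (illustrative only).

    A: (n_biomarkers, n_diseases) association matrix (assumed 0/1 here).
    X: (n_biomarkers, f_b) biomarker features (hypothetical placeholder).
    Y: (n_diseases, f_d) disease features (hypothetical placeholder).
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(X.shape[1], rank))
    H = rng.normal(scale=0.1, size=(Y.shape[1], rank))
    n_entries = A.size
    for _ in range(epochs):
        P = X @ W                # projected biomarker features
        Q = Y @ H                # projected disease features
        R = P @ Q.T - A          # residual of the reconstruction
        # Gradients of the mean squared error plus L2 regularization.
        grad_W = X.T @ (R @ Q) / n_entries + reg * W
        grad_H = Y.T @ (R.T @ P) / n_entries + reg * H
        W -= lr * grad_W
        H -= lr * grad_H
    return W, H

def imc_scores(X, Y, W, H):
    """Score every biomarker-disease pair, including unobserved ones."""
    return (X @ W) @ (Y @ H).T
```

Because the scores depend only on features, the same `W` and `H` can in principle score a biomarker or disease that was absent from the training matrix, which is what makes the completion inductive.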

AVAILABILITY

The source code of GTGenie and the test data are available at: https://github.com/Wolverinerine/GTGenie.
