College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518000, China.
Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac298.
The associations between biomarkers and human diseases play a key role in understanding complex pathology and developing targeted therapies. Wet-lab experiments for biomarker discovery are costly, laborious and time-consuming. Computational prediction methods can greatly expedite the identification of candidate biomarkers.
Here, we present a novel computational model named GTGenie for predicting biomarker-disease associations based on graph and text features. In GTGenie, a graph attention network is used to characterize diverse similarities of biomarkers and diseases from heterogeneous information resources. Meanwhile, a pretrained BERT-based model is applied to learn a text-based representation of biomarker-disease relations from the biomedical literature. The captured graph and text features are then integrated in a bimodal fusion network to model the hybrid entity representation. Finally, inductive matrix completion is adopted to infer the missing entries and reconstruct the relation matrix, from which the unknown biomarker-disease associations are predicted. Experimental results on the HMDD, HMDAD and LncRNADisease data sets showed that GTGenie achieves prediction performance competitive with other state-of-the-art methods.
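The final step, inductive matrix completion, can be illustrated with a minimal sketch. This is not the GTGenie implementation: it assumes each biomarker and each disease already has a feature vector (in GTGenie these would come from the fusion network; here they are small hand-written toy vectors), and it fits two low-rank projection matrices W and H by plain stochastic gradient descent so that the score x^T W H^T y matches the known entries of the relation matrix. The function names (`imc_fit`, `imc_score`, `project`), the rank, the learning rate and the toy data are all assumptions made for illustration.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(features, P):
    """Map a feature vector (length d) through P (d x rank) into latent space."""
    return [dot(features, [row[r] for row in P]) for r in range(len(P[0]))]

def imc_score(x, y, W, H):
    """Bilinear association score x^T W H^T y."""
    return dot(project(x, W), project(y, H))

def imc_fit(X, Y, observed, rank=2, lr=0.05, epochs=2000, seed=0):
    """Fit W, H on observed entries {(i, j): value} with squared-error SGD."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in X[0]]
    H = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in Y[0]]
    for _ in range(epochs):
        for (i, j), a in observed.items():
            u, v = project(X[i], W), project(Y[j], H)
            err = dot(u, v) - a
            # Gradient of 0.5 * err^2 with respect to W and H.
            for p, xp in enumerate(X[i]):
                for r in range(rank):
                    W[p][r] -= lr * err * xp * v[r]
            for q, yq in enumerate(Y[j]):
                for r in range(rank):
                    H[q][r] -= lr * err * yq * u[r]
    return W, H

# Toy data (hypothetical): 3 biomarkers, 2 diseases, 2-dimensional features.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # biomarker features
Y = [[1.0, 0.0], [0.0, 1.0]]               # disease features
# Known entries of the relation matrix; (2, 1) is left unobserved.
observed = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0, (2, 0): 1.0}

W, H = imc_fit(X, Y, observed)
pred = imc_score(X[2], Y[1], W, H)  # inferred score for the missing entry
print(round(pred, 3))
```

Because the score depends only on feature vectors, the same W and H can score a biomarker or disease that has no known associations, provided its features are available; this is what makes the completion inductive.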
The source code of GTGenie and the test data are available at: https://github.com/Wolverinerine/GTGenie.