Bioentity2vec: Attribute- and behavior-driven representation for predicting multi-type relationships between bioentities.

Affiliations

XinJiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, No. 40-1, Beijing South Road, Urumqi, Xinjiang, China.

University of Chinese Academy of Sciences, Beijing 100049, China.

Publication information

Gigascience. 2020 Jun 1;9(6). doi: 10.1093/gigascience/giaa032.

Abstract

BACKGROUND

The explosive growth of genomic, chemical, and pathological data provides new opportunities and challenges for humans to thoroughly understand life activities in cells. However, there exist few computational models that aggregate various bioentities to comprehensively reveal the physical and functional landscape of biological systems.

RESULTS

We constructed a molecular association network containing 18 types of edges (relationships) among 8 types of nodes (bioentities). On this basis, we propose Bioentity2vec, a new method for representing bioentities that integrates information about both the attributes and the behaviors of a bioentity. Using a random forest classifier, we achieved promising performance across the 18 relationship types, with an area under the curve (AUC) of 0.9608 and an area under the precision-recall curve (AUPR) of 0.9572.
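As a rough illustration of the idea described above, the following sketch shows one way an entity representation could combine attribute and behavior features, how a candidate relationship could be encoded as a pair of such vectors, and how an AUC value is computed from classifier scores. It is not the authors' implementation: the entity names, feature values, and the helper functions `entity_vector`, `pair_vector`, and `roc_auc` are hypothetical, simple concatenation is an assumed way of combining features, and the classifier itself (a random forest in the paper) is left out, so the scores here are made-up placeholders.

```python
from typing import Dict, List, Sequence


def entity_vector(attribute: Sequence[float], behavior: Sequence[float]) -> List[float]:
    """Combine attribute and behavior features of one bioentity by concatenation (assumed)."""
    return list(attribute) + list(behavior)


def pair_vector(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Encode a candidate relationship as the concatenation of its two entity vectors."""
    return list(a) + list(b)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy entities: attribute features (e.g. derived from sequence or
# structure) and behavior features (e.g. a network embedding). Values are invented.
entities: Dict[str, List[float]] = {
    "miRNA_x": entity_vector([0.1, 0.7], [0.3, 0.2, 0.9]),
    "disease_y": entity_vector([0.5, 0.4], [0.8, 0.1, 0.6]),
}

# Feature vector for the candidate relationship (miRNA_x, disease_y).
sample = pair_vector(entities["miRNA_x"], entities["disease_y"])

# Placeholder classifier scores for four candidate relationships
# (label 1 = known relationship, 0 = sampled non-relationship).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

In practice the pair vectors would be fed to a trained classifier, and the resulting scores on held-out pairs would be evaluated with both AUC and AUPR, as reported above.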

CONCLUSIONS

Our study shows that constructing a network with rich topological and biological information is important for a systematic understanding of the biological landscape at the molecular level. Our results show that Bioentity2vec can effectively represent biological entities and yields features that are easy to separate in classification tasks. Our method can also predict relationships within a single entity type and across multiple entity types at the same time, which should accelerate progress in biological experimental research and industrial product development.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6292/7293023/20b2f111de98/giaa032fig1.jpg
