DeepLGP: a novel deep learning method for prioritizing lncRNA target genes.

Affiliations

College of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China.

College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China.

Publication information

Bioinformatics. 2020 Aug 15;36(16):4466-4472. doi: 10.1093/bioinformatics/btaa428.

Abstract

MOTIVATION

Although long non-coding RNAs (lncRNAs) have limited capacity for encoding proteins, they have been verified as biomarkers in the occurrence and development of complex diseases. Recent wet-lab experiments have shown that lncRNAs function by regulating the expression of protein-coding genes (PCGs), which may also be the mechanism by which they cause disease. LncRNA-related biological data are increasing rapidly. However, no computational methods have been designed for predicting novel target genes of lncRNAs.

RESULTS

In this study, we present a graph convolutional network (GCN) based method, named DeepLGP, for prioritizing target PCGs of lncRNAs. First, gene and lncRNA features were selected, including their location in the genome, their expression in 13 tissues and miRNA-mediated lncRNA-gene pairs. Next, a GCN was applied to convolve a gene interaction network, encoding the features of genes and lncRNAs. Then, a convolutional neural network (CNN) used these encoded features to prioritize the target genes of lncRNAs. In 10-fold cross-validation on two independent datasets, DeepLGP obtained high areas under the curve (AUC; 0.90-0.98) and areas under the precision-recall curve (AUPR; 0.91-0.98). We found that lncRNA pairs with high similarity shared more target genes. Further experiments showed that genes targeted by the same set of lncRNAs were highly likely to cause the same diseases, which could help in identifying disease-causing PCGs.
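To make the graph-convolution step concrete, the following is a minimal sketch of a single GCN layer using the widely used symmetric-normalization rule of Kipf and Welling, H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W). It is an illustration only, not the DeepLGP implementation: the abstract does not state which propagation rule DeepLGP uses, and the 4-node network, the feature matrix and the weight matrix below are hypothetical toy values rather than the genomic-location, expression or miRNA features described above.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected adjacency matrix A."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so each node keeps its own features
    deg = a_hat.sum(axis=1)              # node degrees, self-loop included
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Scale rows and columns by 1/sqrt(degree).
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weights):
    """One graph-convolution layer: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(adj_norm @ features @ weights, 0.0)


# Hypothetical toy interaction network: a chain of 4 nodes (0-1-2-3).
adj = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)

rng = np.random.default_rng(0)
features = rng.random((4, 3))        # 3 made-up input features per node
weights = rng.normal(size=(3, 2))    # untrained projection to a 2-dimensional embedding

adj_norm = normalize_adjacency(adj)
embeddings = gcn_layer(adj_norm, features, weights)
print(embeddings.shape)
```

Each row of `embeddings` mixes a node's own features with those of its direct neighbors in the network. In a pipeline like the one described, embeddings of this kind would be passed to a downstream classifier; here the weights are random and nothing is trained.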

AVAILABILITY AND IMPLEMENTATION

https://github.com/zty2009/LncRNA-target-gene.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
