Improving circRNA-disease association prediction by sequence and ontology representations with convolutional and recurrent neural networks.

Authors

Lu Chengqian, Zeng Min, Wu Fang-Xiang, Li Min, Wang Jianxin

Affiliations

School of Computer Science and Engineering, Central South University, Changsha 410083, P.R. China.

Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, P.R. China.

Publication information

Bioinformatics. 2021 Apr 5;36(24):5656-5664. doi: 10.1093/bioinformatics/btaa1077.

Abstract

MOTIVATION

Emerging studies indicate that circular RNAs (circRNAs) are widely involved in the progression of human diseases. Owing to their stable circular structure, circRNAs are promising diagnostic and prognostic biomarkers for diseases. However, experimental verification of circRNA-disease associations is expensive and limited to small-scale studies. Effective computational methods for predicting potential circRNA-disease associations are therefore urgently needed. Although several models have been proposed, their over-reliance on known associations and their lack of features describing biological function mean that precise prediction remains challenging.

RESULTS

In this study, we propose CDASOR, a method for predicting CircRNA-disease associations based on sequence and ontology representations with convolutional and recurrent neural networks. For circRNA sequences, we encode them as continuous k-mers, obtain low-dimensional vectors for the k-mers, extract local feature vectors with a 1D CNN and learn long-term dependencies with a bi-directional long short-term memory network. For diseases, we serialize the disease ontology into sentences that preserve the ontology hierarchy, obtain low-dimensional vectors for disease ontology terms and learn the dependencies between terms. Furthermore, we learn association patterns of circRNAs and diseases from known circRNA-disease associations with neural networks. These steps yield high-level representations of circRNAs and diseases, which are informative for improving prediction. The experimental results show that CDASOR provides accurate predictions. By incorporating features of biological function, CDASOR also performs well in the de novo test. In addition, six of the top 10 predicted results in the case studies are verified by the published literature.
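The two input encodings described above can be illustrated with a minimal sketch. It is not taken from the CDASOR code: the function names, the choice of k = 3, the stride of 1, the integer vocabulary, and the idea of turning the ontology into root-to-term paths are all assumptions made here for illustration, and the toy ontology is hypothetical. The embedding, 1D CNN and BiLSTM layers that would consume these token ids are omitted.

```python
from itertools import product


def kmer_tokens(seq, k=3):
    """Split a sequence into overlapping k-mers (stride 1, an assumption)."""
    seq = seq.upper().replace("T", "U")  # treat DNA-style input as RNA
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(k=3, alphabet="ACGU"):
    """Map every possible k-mer to an integer id; 0 is reserved for padding/unknown."""
    return {"".join(p): i + 1 for i, p in enumerate(product(alphabet, repeat=k))}


def encode(seq, vocab, k=3):
    """Turn a sequence into the list of integer ids an embedding layer would take."""
    return [vocab.get(tok, 0) for tok in kmer_tokens(seq, k)]


def ontology_sentences(parents, term):
    """List every root-to-term path in a DAG given as child -> list of parents.

    Each path is one 'sentence' whose word order follows the hierarchy.
    This is one possible serialization, assumed here for illustration.
    """
    direct_parents = parents.get(term, [])
    if not direct_parents:
        return [[term]]
    return [path + [term]
            for parent in direct_parents
            for path in ontology_sentences(parents, parent)]


if __name__ == "__main__":
    vocab = build_vocab(k=3)
    print(kmer_tokens("ACGTA"))    # ['ACG', 'CGU', 'GUA']
    print(encode("ACGTA", vocab))  # [7, 28, 45]

    # Hypothetical toy ontology: term D has two parents, B and C, under root A.
    toy_parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
    print(ontology_sentences(toy_parents, "D"))  # [['A', 'B', 'D'], ['A', 'C', 'D']]
```

In a full model, the integer ids from `encode` would be looked up in a learned embedding table before the convolutional and recurrent layers, and the ontology sentences would be embedded in the same way, term by term.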

AVAILABILITY AND IMPLEMENTATION

The code and data of CDASOR are freely available at https://github.com/BioinformaticsCSU/CDASOR.

