A Learning-Based Method for LncRNA-Disease Association Identification Combing Similarity Information and Rotation Forest.

Author information

Guo Zhen-Hao, You Zhu-Hong, Wang Yan-Bin, Yi Hai-Cheng, Chen Zhan-Heng

Affiliations

Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China.

Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.

Publication information

iScience. 2019 Sep 27;19:786-795. doi: 10.1016/j.isci.2019.08.030. Epub 2019 Aug 23.

Abstract

Long non-coding RNAs (lncRNAs) play critical roles in the occurrence and development of various diseases. Determining lncRNA-disease associations would therefore provide new insights into disease pathogenesis, diagnosis, and gene therapy. Because traditional experimental approaches struggle to detect potential human lncRNA-disease associations in the vast amount of available biological data, computational methods could be of significant value. In this paper, we propose a novel computational method named LDASR that identifies associations between lncRNAs and diseases by analyzing known lncRNA-disease associations. First, the feature vectors of the lncRNA-disease pairs are obtained by integrating lncRNA Gaussian interaction profile kernel similarity, disease semantic similarity, and disease Gaussian interaction profile kernel similarity. Second, an autoencoder neural network is employed to reduce the feature dimension and obtain the optimal feature subspace from the original feature set. Finally, Rotation Forest is used to predict lncRNA-disease associations. The proposed method achieves excellent performance, with an AUC of 0.9502 in leave-one-out cross-validation (LOOCV) and an AUC of 0.9428 in 5-fold cross-validation, significantly outperforming previous methods. Moreover, case studies on identifying lncRNAs associated with colorectal cancer and glioma further demonstrate the capability of LDASR to identify novel lncRNA-disease associations. These promising experimental results show that LDASR can be a valuable addition to biomedical research in the future.
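The abstract does not give the formula for the Gaussian interaction profile (GIP) kernel similarity, so the following is only a minimal sketch. It assumes the commonly used definition, in which the similarity between two entities is exp(-gamma * ||IP(a) - IP(b)||^2) and gamma is a bandwidth parameter divided by the mean squared norm of all interaction profiles. The function names `gip_kernel` and `pair_features`, the parameter `gamma_prime`, the toy association matrix, and the simple concatenation used to build a pair feature vector are illustrative assumptions, not details taken from the paper. The disease semantic similarity, the autoencoder, and the Rotation Forest steps are not shown.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel similarity between the rows of a binary association matrix.

    Assumed definition: K[a, b] = exp(-gamma * ||IP(a) - IP(b)||^2), with
    gamma = gamma_prime / mean(||IP(a)||^2) over all rows.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (profiles @ profiles.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


def pair_features(lnc_sim, dis_sim, i, j):
    """Hypothetical pair descriptor: lncRNA i's similarity row joined with disease j's."""
    return np.concatenate([lnc_sim[i], dis_sim[j]])


# Toy association matrix (made-up data): 4 lncRNAs (rows) x 3 diseases (columns).
assoc = np.array([
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
])

lnc_sim = gip_kernel(assoc)    # 4 x 4 lncRNA similarity
dis_sim = gip_kernel(assoc.T)  # 3 x 3 disease similarity
features = pair_features(lnc_sim, dis_sim, 0, 2)  # vector of length 4 + 3
```

In this toy example, lncRNAs 0 and 1 have identical interaction profiles, so their GIP similarity is 1. Feature vectors built this way would then be passed to the dimensionality-reduction and classification stages described above.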

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0d21/6733997/b6446914affe/fx1.jpg
