LDAEXC: LncRNA-Disease Associations Prediction with Deep Autoencoder and XGBoost Classifier.

Affiliation

College of Information Science and Engineering, Hunan Normal University, Changsha, China.

Publication information

Interdiscip Sci. 2023 Sep;15(3):439-451. doi: 10.1007/s12539-023-00573-z. Epub 2023 Jun 12.

Abstract

A large body of scientific evidence has shown that long non-coding RNAs (lncRNAs) are involved in the progression of complex human diseases and in biological processes. Identifying novel disease-related lncRNAs therefore helps the diagnosis, prognosis and therapy of many complex human diseases. Because traditional laboratory experiments are costly and time-consuming, a large number of computational algorithms have been proposed for predicting the relationships between lncRNAs and diseases. However, there is still much room for improvement. In this paper, we introduce an accurate framework named LDAEXC to infer LncRNA-Disease Associations with a deep autoencoder and an XGBoost Classifier. LDAEXC uses different similarity views of lncRNAs and human diseases to construct features for each data source. The constructed feature vectors are then fed into a deep autoencoder to obtain reduced features, and finally an XGBoost classifier uses the reduced features to compute latent lncRNA-disease association scores. Fivefold cross-validation experiments on four datasets showed that LDAEXC reached AUC scores of 0.9676 ± 0.0043, 0.9449 ± 0.022, 0.9375 ± 0.0331 and 0.9556 ± 0.0134, respectively, significantly higher than other advanced computational methods of the same kind. Extensive experimental results and case studies of two complex diseases (colon and breast cancers) further indicated the practicability and excellent prediction performance of LDAEXC in inferring unknown lncRNA-disease associations.

LDAEXC utilizes disease semantic similarity, lncRNA expression similarity, and Gaussian interaction profile kernel similarity of lncRNAs and diseases for feature construction. The constructed features are fed to a deep autoencoder to extract reduced features, and an XGBoost classifier is used to predict the lncRNA-disease associations based on the reduced features. The fivefold and tenfold cross-validation experiments on a benchmark dataset showed that LDAEXC could achieve AUC scores of 0.9676 and 0.9682, respectively, significantly higher than other state-of-the-art methods of the same kind.
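
The abstract names Gaussian interaction profile (GIP) kernel similarity as one of the similarity views but does not give its formula. The sketch below uses the definition commonly found in the association-prediction literature, exp(-γ‖IP(i) − IP(j)‖²) with γ set to a base bandwidth divided by the mean squared norm of the profiles. It is not necessarily the exact variant used in LDAEXC: the function name `gip_kernel`, the default `gamma_prime=1.0`, and the toy association matrix are assumptions made for illustration only.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile (GIP) kernel similarity between rows.

    profiles: list of equal-length 0/1 lists; row i is the interaction
    profile of entity i (for example, one lncRNA against all diseases).
    gamma_prime: base bandwidth (assumed default, not from the paper).
    """
    n = len(profiles)
    # The bandwidth is normalized by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(profiles[i], profiles[j]))
            # The kernel is symmetric, so fill both triangles at once.
            sim[i][j] = sim[j][i] = math.exp(-gamma * sq_dist)
    return sim


# Hypothetical toy data: 3 lncRNAs (rows) x 4 diseases (columns),
# where 1 marks a known association.
associations = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
]

# lncRNA similarity is computed over the rows of the association matrix.
lnc_sim = gip_kernel(associations)

# Disease similarity is computed over the columns (the transposed matrix).
disease_profiles = [list(col) for col in zip(*associations)]
dis_sim = gip_kernel(disease_profiles)
```

In a pipeline like the one described, similarity rows for a lncRNA and a disease would be combined into one feature vector per lncRNA-disease pair before dimensionality reduction. How LDAEXC combines them is not stated in the abstract, so that step is left out here.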
