LDAEXC: LncRNA-Disease Associations Prediction with Deep Autoencoder and XGBoost Classifier.

Author information

College of Information Science and Engineering, Hunan Normal University, Changsha, China.

Publication information

Interdiscip Sci. 2023 Sep;15(3):439-451. doi: 10.1007/s12539-023-00573-z. Epub 2023 Jun 12.

Abstract

A large body of scientific evidence has revealed that long non-coding RNAs (lncRNAs) are involved in the progression of complex human diseases and in biological activities. Identifying novel disease-related lncRNAs is therefore helpful for the diagnosis, prognosis and therapy of many complex human diseases. Because traditional laboratory experiments are costly and time-consuming, a large number of computational algorithms have been proposed for predicting the relationships between lncRNAs and diseases. However, there is still much room for improvement. In this paper, we introduce an accurate framework named LDAEXC to infer LncRNA-Disease Associations with a deep autoencoder and an XGBoost Classifier. LDAEXC utilizes different similarity views of lncRNAs and human diseases to construct features for each data source. Reduced features are then obtained by feeding the constructed feature vectors into a deep autoencoder, and finally an XGBoost classifier is used to calculate latent lncRNA-disease association scores from the reduced features. Fivefold cross-validation experiments on four datasets showed that LDAEXC reached AUC scores of 0.9676 ± 0.0043, 0.9449 ± 0.022, 0.9375 ± 0.0331 and 0.9556 ± 0.0134, respectively, significantly higher than those of other advanced computational methods of the same kind. Extensive experimental results and case studies of two complex diseases (colon and breast cancers) further indicated the practicability and excellent prediction performance of LDAEXC in inferring unknown lncRNA-disease associations.

LDAEXC utilizes disease semantic similarity, lncRNA expression similarity, and Gaussian interaction profile kernel similarity of lncRNAs and diseases for feature construction. The constructed features are fed to a deep autoencoder to extract reduced features, and an XGBoost classifier is used to predict lncRNA-disease associations based on the reduced features.
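
To make the feature-construction step more concrete, the following is a minimal sketch of a Gaussian interaction profile (GIP) kernel similarity and of one way to turn similarity matrices into a feature vector for an lncRNA-disease pair. The abstract does not give the exact formulas, so several things here are assumptions: the bandwidth normalisation by the mean squared profile norm (a common convention for GIP kernels), the choice to concatenate the two similarity rows, the function names `gip_kernel` and `pair_feature`, and the toy association matrix. It illustrates the idea only and is not the authors' implementation.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile (GIP) kernel similarity between rows.

    profiles[i] is the binary association profile of entity i (for an
    lncRNA: its row of the lncRNA-disease association matrix).
    Assumption: the bandwidth is gamma_prime divided by the mean squared
    profile norm, a common convention not confirmed by the abstract.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("every interaction profile is empty")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = sim[j][i] = math.exp(-gamma * sq_dist)
    return sim


def pair_feature(lnc_sim, dis_sim, lnc_index, dis_index):
    """Feature vector for one lncRNA-disease pair.

    Assumption: the pair is described by concatenating the lncRNA's
    similarity row with the disease's similarity row.
    """
    return list(lnc_sim[lnc_index]) + list(dis_sim[dis_index])


# Toy association matrix (made-up values): 3 lncRNAs x 2 diseases.
assoc = [[1, 0],
         [1, 1],
         [0, 1]]

lnc_sim = gip_kernel(assoc)                                # 3 x 3
dis_sim = gip_kernel([list(col) for col in zip(*assoc)])   # 2 x 2
feature = pair_feature(lnc_sim, dis_sim, 0, 1)             # length 3 + 2
```

In a full pipeline, vectors like `feature` would be built for every labelled pair, compressed by the autoencoder, and passed to the classifier; the other similarity views named above (disease semantic similarity, lncRNA expression similarity) would contribute further rows in the same way.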
Fivefold and tenfold cross-validation experiments on a benchmark dataset showed that LDAEXC achieved AUC scores of 0.9676 and 0.9682, respectively, significantly higher than those of other state-of-the-art methods of the same kind.
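
The cross-validated AUC figures reported above can be read as a mean and a spread over folds. The sketch below shows one way such a number might be computed; it is an illustration under stated assumptions, not the authors' evaluation code. The stratified fold assignment, the use of the sample standard deviation, the helper names, the toy data, and in particular the stand-in scorer `fit_centroid` (which takes the place of the autoencoder plus XGBoost stage) are all hypothetical.

```python
import random
import statistics


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Deal shuffled positives and negatives into k disjoint test folds."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    return [pos[f::k] + neg[f::k] for f in range(k)]


def cross_validated_auc(features, labels, fit, k=5, seed=0):
    """Return (mean, sample standard deviation) of the per-fold test AUC.

    fit(train_x, train_y) must return a function that maps one feature
    vector to a score.
    """
    aucs = []
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        score = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        aucs.append(roc_auc([labels[i] for i in test_idx],
                            [score(features[i]) for i in test_idx]))
    return statistics.mean(aucs), statistics.stdev(aucs)


def fit_centroid(train_x, train_y):
    """Hypothetical stand-in scorer: closeness to the positive centroid."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    centroid = [sum(col) / len(pos) for col in zip(*pos)]
    return lambda x: -sum((a - c) ** 2 for a, c in zip(x, centroid))


# Toy, well-separated data (made-up): 10 positive and 10 negative pairs.
features = ([[1.0 + 0.01 * i, 1.0] for i in range(10)]
            + [[0.01 * i, 0.0] for i in range(10)])
labels = [1] * 10 + [0] * 10

mean_auc, std_auc = cross_validated_auc(features, labels, fit_centroid, k=5)
```

Swapping `fit_centroid` for a function that trains a real model on the training fold would leave the evaluation loop unchanged, which is the main reason for passing the model in as a callable.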
