Zhou Liqian, Peng Xinhuai, Zeng Lijun, Peng Lihong
School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan, China.
School of Computer Science, Hunan Institute of Technology, Hengyang, China.
Front Genet. 2024 Mar 1;15:1356205. doi: 10.3389/fgene.2024.1356205. eCollection 2024.
Long non-coding RNAs (lncRNAs) have been used clinically as potential prognostic biomarkers for various types of cancer. Identifying associations between lncRNAs and diseases helps uncover potential biomarkers and design effective therapeutic options. Wet-lab experiments for identifying these associations are costly and laborious. We developed LDA-SABC, a novel boosting-based framework for lncRNA-disease association (LDA) prediction. LDA-SABC extracts LDA features using singular value decomposition (SVD) and classifies lncRNA-disease pairs (LDPs) by incorporating LightGBM and AdaBoost into a convolutional neural network. Its performance was evaluated under five-fold cross-validation (CV) on lncRNAs, diseases, and LDPs. It clearly outperformed four other classical LDA inference methods (SDLDA, LDNFSGB, LDASR, and IPCAF) in terms of precision, recall, accuracy, F1 score, AUC, and AUPR. Given this prediction performance, we applied LDA-SABC to find potential lncRNA biomarkers for lung cancer. The results suggested that 7SK and HULC may be associated with non-small-cell lung cancer (NSCLC) and lung adenocarcinoma (LUAD), respectively. We hope that the proposed LDA-SABC method can help improve LDA identification.
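To make the SVD-based feature extraction step more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary lncRNA-by-disease association matrix, a rank `k` chosen by the user, and a pair representation built by concatenating the two latent vectors; the function names `svd_pair_features` and `pair_vector` and the toy matrix are hypothetical. The resulting pair vectors could then be passed to a downstream classifier such as the boosted model described above.

```python
import numpy as np


def svd_pair_features(assoc, k):
    """Rank-k truncated SVD of an association matrix (illustrative assumption).

    Returns latent features for rows (lncRNAs) and columns (diseases),
    each scaled by the square root of the retained singular values.
    """
    u, s, vt = np.linalg.svd(assoc, full_matrices=False)
    root = np.sqrt(s[:k])
    lnc_feats = u[:, :k] * root      # shape: (n_lncRNAs, k)
    dis_feats = vt[:k, :].T * root   # shape: (n_diseases, k)
    return lnc_feats, dis_feats


def pair_vector(lnc_feats, dis_feats, i, j):
    """Feature vector for the pair (lncRNA i, disease j) by concatenation."""
    return np.concatenate([lnc_feats[i], dis_feats[j]])


# Toy association matrix: 4 lncRNAs x 3 diseases (made-up values).
assoc = np.array(
    [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 0],
     [0, 1, 1]],
    dtype=float,
)

lnc_feats, dis_feats = svd_pair_features(assoc, k=2)
x = pair_vector(lnc_feats, dis_feats, 0, 2)
print(x.shape)  # (4,)
```

Scaling both factor matrices by the square root of the singular values is one common convention; it means the product of the lncRNA and disease features approximates the original matrix, and reproduces it exactly when `k` equals the matrix rank.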