Dai Chengqiu, Wang Linna, Deng Yingwei, Gao Xuzhu, Zhang Jingyu
School of Computer Science and Engineering, Hunan Institute of Technology, Hengyang, 421002, Hunan, China.
The Sixth Department of Oncology, Beidahuang Industry Group General Hospital (Heilongjiang Second Cancer Hospital), Harbin, 150088, Heilongjiang, China.
BMC Bioinformatics. 2025 Jul 22;26(1):190. doi: 10.1186/s12859-025-06169-2.
Long non-coding RNAs (lncRNAs) play essential roles in various physiological and pathological processes. Inferring new lncRNA-disease associations (LDAs) not only helps us better understand these complex biological processes, but also provides new options for the diagnosis and prevention of diseases.
A novel computational model, LDA-SCGB, is proposed to predict new LDAs. LDA-SCGB first extracts features for each lncRNA-disease pair using singular value decomposition. It then classifies unknown lncRNA-disease pairs with a condensed gradient boosting model. LDA-SCGB was compared with four representative LDA inference methods (SDLDA, LDNFSGB, LDAenDL, and LDASR) on three LDA datasets, drawn from lncRNADisease v2.0, MNDR, and lncRNADisease v3.0, respectively. Under 5-fold cross-validation on lncRNAs, on diseases, and on lncRNA-disease pairs, LDA-SCGB greatly outperformed all four methods. LDA-SCGB was further used to find potential lncRNAs for colorectal cancer, heart failure, and lung adenocarcinoma. The results showed that CCDC26, MIAT, and CCDC26 had higher association probabilities with colorectal cancer, heart failure, and lung adenocarcinoma, respectively.
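The feature-extraction step and the three cross-validation settings can be illustrated with a minimal sketch. The abstract does not specify how the SVD factors are turned into pair features or how folds are drawn, so everything below is an assumption: the function names `svd_pair_features` and `fold_ids`, the choice to concatenate lncRNA and disease embeddings, the `rank` parameter, and the toy association matrix are all hypothetical. The condensed gradient boosting classifier itself is not reproduced here; the resulting features would be passed to whichever gradient boosting implementation is in use.

```python
import numpy as np


def svd_pair_features(assoc, rank):
    """Build one feature vector per lncRNA-disease pair from a truncated SVD.

    assoc: (n_lncRNA, n_disease) binary association matrix.
    rank:  number of singular components to keep (an assumed hyperparameter).
    """
    u, s, vt = np.linalg.svd(assoc, full_matrices=False)
    root = np.sqrt(s[:rank])
    lnc_emb = u[:, :rank] * root       # one row per lncRNA
    dis_emb = vt[:rank, :].T * root    # one row per disease

    n_lnc, n_dis = assoc.shape
    rows, cols = np.meshgrid(np.arange(n_lnc), np.arange(n_dis), indexing="ij")
    rows, cols = rows.ravel(), cols.ravel()

    # Assumed design: concatenate the two embeddings for each pair.
    feats = np.hstack([lnc_emb[rows], dis_emb[cols]])
    labels = assoc[rows, cols].astype(int)
    return feats, labels, rows, cols


def fold_ids(rows, cols, mode, n_folds=5, seed=0):
    """Assign each pair to a fold for one of three cross-validation settings.

    mode="lncRNA":  all pairs of one lncRNA share a fold.
    mode="disease": all pairs of one disease share a fold.
    mode="pair":    each pair is assigned independently.
    """
    if mode == "lncRNA":
        keys = rows
    elif mode == "disease":
        keys = cols
    elif mode == "pair":
        keys = np.arange(rows.size)
    else:
        raise ValueError(f"unknown mode: {mode}")

    rng = np.random.default_rng(seed)
    unique = np.unique(keys)
    # Shuffle, then take the remainder to get near-equal fold sizes.
    assignment = rng.permutation(unique.size) % n_folds
    lookup = dict(zip(unique.tolist(), assignment.tolist()))
    return np.array([lookup[k] for k in keys.tolist()])


# Toy data: 6 lncRNAs x 5 diseases, invented for illustration only.
assoc = np.array(
    [
        [1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [1, 1, 0, 0, 1],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [1, 0, 0, 0, 1],
    ],
    dtype=float,
)

feats, labels, rows, cols = svd_pair_features(assoc, rank=2)
lnc_folds = fold_ids(rows, cols, mode="lncRNA")
dis_folds = fold_ids(rows, cols, mode="disease")
pair_folds = fold_ids(rows, cols, mode="pair")
```

In a real evaluation the SVD would presumably be computed on a copy of the matrix in which the held-out associations are masked, so that test labels do not leak into the features; the sketch skips that step for brevity.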
We anticipate that LDA-SCGB can predict potential lncRNAs for complex diseases and further assist in cancer diagnosis and therapy.