POCALI: Prediction and Insight on CAncer LncRNAs by Integrating Multi-Omics Data with Machine Learning.

Authors

Rao Ziyan, Wu Chenyang, Liao Yunxi, Ye Chuan, Huang Shaodong, Zhao Dongyu

Affiliations

Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, 100191, China.

State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing, 100191, China.

Publication information

Small Methods. 2025 Jul;9(7):e2401987. doi: 10.1002/smtd.202401987. Epub 2025 May 23.

Abstract

Long non-coding RNAs (lncRNAs) are receiving increasing attention as biomarkers for cancer diagnosis and therapy. Although many computational methods exist to identify cancer lncRNAs, they neither comprehensively integrate multi-omics features for prediction nor systematically evaluate the contribution of each omics layer to the multifaceted landscape of cancer lncRNAs. In this study, an algorithm, POCALI, is developed to identify cancer lncRNAs by integrating 44 omics features across six categories. The contributions of the different omics layers to identifying cancer lncRNAs are explored, as is, more specifically, how each feature contributes to a single prediction. The model is evaluated, and POCALI is benchmarked against existing methods. Finally, the cancer phenotype and genomic characteristics of the predicted novel cancer lncRNAs are validated. POCALI identifies secondary structure and gene expression-related features as strong predictors of cancer lncRNAs, and epigenomic features as moderate predictors. POCALI performs better than other methods, especially in terms of sensitivity, and predicts more candidates. Novel POCALI-predicted cancer lncRNAs have strong relationships with cancer phenotypes, similar to known cancer lncRNAs. Overall, this study facilitates the identification of previously undetected cancer lncRNAs and the comprehensive exploration of the multifaceted feature contributions to cancer lncRNA prediction.
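
Because the benchmark is summarized mainly in terms of sensitivity, it may help to recall how that metric is computed for a binary cancer/non-cancer lncRNA classifier. The following is a minimal sketch, not the authors' evaluation code: the function name `sensitivity_specificity` and the toy label vectors are hypothetical and are not taken from the study.

```python
import math


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = cancer lncRNA)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: fraction of known cancer lncRNAs that are recovered.
    sens = tp / (tp + fn) if (tp + fn) else math.nan
    # Specificity: fraction of non-cancer lncRNAs that are correctly rejected.
    spec = tn / (tn + fp) if (tn + fp) else math.nan
    return sens, spec


# Hypothetical toy data: four known cancer lncRNAs followed by four negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
method_a = [1, 1, 1, 0, 1, 0, 0, 0]  # recovers more positives, one false alarm
method_b = [1, 0, 0, 0, 0, 0, 0, 0]  # conservative: few calls, no false alarms

print(sensitivity_specificity(y_true, method_a))
print(sensitivity_specificity(y_true, method_b))
```

In this toy setting the first method trades a little specificity for much higher sensitivity and also nominates more candidates, which is the kind of trade-off the abstract describes qualitatively.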

