BiGM-lncLoc: Bi-level Multi-Graph Meta-Learning for Predicting Cell-Specific Long Noncoding RNAs Subcellular Localization.

Author information

Deng Xi, Liu Lin

Affiliations

School of Information, Yunnan Normal University, Kunming, 650500, China.

Department of Education of Yunnan Province, Engineering Research Center of Computer Vision and Intelligent Control Technology, Kunming, 650500, China.

Publication information

Interdiscip Sci. 2025 Jun;17(2):359-374. doi: 10.1007/s12539-024-00679-y. Epub 2024 Dec 26.

Abstract

The precise spatiotemporal expression of long noncoding RNAs (lncRNAs) plays a pivotal role in biological regulation, and aberrant expression of lncRNAs in different subcellular localizations has been linked to the onset and progression of a variety of cancers. Computational methods provide an effective means of predicting lncRNA subcellular localization, but current studies ignore either cell line and tissue specificity or the correlation and shared information among cell lines. In this study, we propose a novel approach, BiGM-lncLoc, which treats the prediction of lncRNA subcellular localization across cell lines as a multi-graph meta-learning task. Our investigation involves two categories of data: localization data for nucleotide sequences in different cell lines, and cell line expression data. BiGM-lncLoc comprises a cell line-specific optimization network, which learns specific knowledge from cell line expression data, and a graph neural network optimized across cell lines. The specific and shared knowledge acquired through bi-level optimization is then applied to a new cell-line prediction task without re-training or fine-tuning. Additionally, through key feature analysis of the impact of different nucleotide combinations on the model, we confirm, based on correlation analysis, the necessity of cell line-specific studies. Finally, experiments on cell lines with different data sizes indicate that BiGM-lncLoc outperforms other methods in prediction accuracy, with an average accuracy of 97.7%. After overlapping samples were removed to ensure data independence for each cell line, accuracy ranged from 82.4% to 94.7%, still surpassing existing models. Our code is available at https://github.com/BioCL1/BiGM-lncLoc.
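
To make the two-level idea in the abstract concrete, the following is a minimal, first-order sketch on synthetic data. It is not the authors' implementation (see the repository linked above for that). Everything in it is an assumption made for illustration: a logistic regression stands in for the graph neural network, a linear map `u` over a made-up expression vector stands in for the cell line-specific network, and the names `make_task`, `loss_and_grads`, and `run_sketch` are hypothetical. The lower level updates the cell line-specific map on each task's support set, the upper level updates the shared weights on the query set, and a held-out "cell line" is then scored with no further gradient steps.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_task(rng, w_true, u_true, n, expr_dim):
    # One synthetic "cell line": an expression vector e plus labelled samples.
    e = [rng.gauss(0, 1) for _ in range(expr_dim)]
    shift = dot(u_true, e)  # cell line-specific offset (assumed form)
    xs = [[rng.gauss(0, 1) for _ in w_true] for _ in range(n)]
    ys = [1.0 if dot(w_true, x) + shift > 0 else 0.0 for x in xs]
    half = n // 2
    return {"e": e,
            "support": (xs[:half], ys[:half]),
            "query": (xs[half:], ys[half:])}


def loss_and_grads(w, u, e, xs, ys):
    # Mean logistic loss and gradients w.r.t. shared w and specific u.
    gw, gu, loss = [0.0] * len(w), [0.0] * len(u), 0.0
    offset = dot(u, e)
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(dot(w, x) + offset), 1e-12), 1 - 1e-12)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
        err = p - y
        for i, xi in enumerate(x):
            gw[i] += err * xi
        for j, ej in enumerate(e):
            gu[j] += err * ej
    n = len(xs)
    return loss / n, [g / n for g in gw], [g / n for g in gu]


def mean_query_loss(w, u, tasks):
    losses = [loss_and_grads(w, u, t["e"], *t["query"])[0] for t in tasks]
    return sum(losses) / len(losses)


def run_sketch(seed=0, n_tasks=8, dim=5, expr_dim=3, epochs=100, lr=0.2):
    rng = random.Random(seed)
    w_true = [rng.gauss(0, 1) for _ in range(dim)]
    u_true = [rng.gauss(0, 1) for _ in range(expr_dim)]
    tasks = [make_task(rng, w_true, u_true, 100, expr_dim)
             for _ in range(n_tasks)]
    new_task = make_task(rng, w_true, u_true, 100, expr_dim)

    w, u = [0.0] * dim, [0.0] * expr_dim
    initial_loss = mean_query_loss(w, u, tasks)
    for _ in range(epochs):
        for t in tasks:
            # Lower level: update the cell line-specific map on the support set.
            _, _, gu = loss_and_grads(w, u, t["e"], *t["support"])
            u = [ui - lr * g for ui, g in zip(u, gu)]
            # Upper level: update the shared weights on the query set.
            _, gw, _ = loss_and_grads(w, u, t["e"], *t["query"])
            w = [wi - lr * g for wi, g in zip(w, gw)]
    final_loss = mean_query_loss(w, u, tasks)

    # Held-out cell line: predict directly, with no re-training or fine-tuning.
    xs = new_task["support"][0] + new_task["query"][0]
    ys = new_task["support"][1] + new_task["query"][1]
    offset = dot(u, new_task["e"])
    correct = sum((sigmoid(dot(w, x) + offset) > 0.5) == (y == 1.0)
                  for x, y in zip(xs, ys))
    return initial_loss, final_loss, correct / len(xs)
```

Calling `run_sketch(seed=0)` returns the mean query loss before and after training and the accuracy on the held-out synthetic cell line. The numbers it produces describe only this toy setup and say nothing about the accuracies reported in the abstract.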
