Circ-LocNet: A Computational Framework for Circular RNA Sub-Cellular Localization Prediction.

Affiliations

German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.

Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany.

Publication information

Int J Mol Sci. 2022 Jul 26;23(15):8221. doi: 10.3390/ijms23158221.

Abstract

Circular ribonucleic acids (circRNAs) are novel non-coding RNAs that arise from back-splicing of precursor mRNA, in which exons are joined in reversed order. Despite the abundance of circRNAs in human genes and their involvement in diverse physiological processes, the function of most circRNAs remains unknown. As with other non-coding RNAs, knowledge of circRNA sub-cellular localization can clarify the influence of circRNAs on protein synthesis, degradation, and destination, their association with different diseases, and their potential for drug development. To date, wet-lab experimental approaches are used to detect the sub-cellular locations of circular RNAs. These approaches help to elucidate the role of circRNAs as protein scaffolds, RNA-binding protein (RBP) sponges, micro-RNA (miRNA) sponges, parental gene expression modifiers, alternative splicing regulators, and transcription regulators. To complement wet-lab experiments, and considering the progress made by machine learning approaches in determining the sub-cellular localization of other non-coding RNAs, this paper develops a computational framework, Circ-LocNet, to accurately predict circRNA sub-cellular localization. Circ-LocNet performs a comprehensive extrinsic evaluation of seven sequence descriptors (residue frequency-based, residue order and frequency-based, and physicochemical property-based) using the five most widely used machine learning classifiers. Further, it explores the performance impact of K-order sequence descriptor fusion, in which it combines similar as well as dissimilar genres of statistical representation learning approaches to reap their combined benefits. Considering the diversity of statistical representation learning schemes, it assesses the performance of every fusion order from second-order up to seventh-order. A comprehensive empirical evaluation of Circ-LocNet over a newly developed benchmark dataset under different settings reveals that, among standalone descriptors, residue frequency-based sequence descriptors paired with tree-based classifiers are the most suitable for predicting the sub-cellular localization of circular RNAs. Further, K-order heterogeneous sequence descriptor fusion in combination with tree-based classifiers predicts the sub-cellular localization of circular RNAs most accurately. We anticipate that this study will act as a rich baseline and push the development of robust computational methodologies for accurately determining the sub-cellular localization of novel circRNAs.
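
The descriptor and fusion ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the seven descriptors or say how they are fused, so the code assumes a plain k-mer frequency vector as a stand-in for a residue frequency-based descriptor and assumes fusion means concatenating descriptor vectors. The function names (`kmer_frequencies`, `fuse`, `fusion_orders`) and the placeholder descriptor names `d1` to `d7` are hypothetical.

```python
from itertools import combinations, product

ALPHABET = "ACGU"  # RNA residues; T is mapped to U below


def kmer_frequencies(seq, k=2):
    """Normalized k-mer frequency vector (assumed stand-in for a
    residue frequency-based descriptor)."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous residues such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def fuse(seq, descriptors):
    """K-order fusion, assumed here to be concatenation of K descriptor vectors."""
    fused = []
    for describe in descriptors:
        fused.extend(describe(seq))
    return fused


def fusion_orders(names, max_order=None):
    """Enumerate every descriptor subset of order 2 up to max_order."""
    max_order = max_order or len(names)
    return {k: list(combinations(names, k)) for k in range(2, max_order + 1)}


# Second-order fusion of mono- and di-nucleotide frequencies: 4 + 16 features.
mono = lambda s: kmer_frequencies(s, 1)
di = lambda s: kmer_frequencies(s, 2)
print(len(fuse("ACGUACGU", [mono, di])))  # 20

# With seven descriptors, orders 2 through 7 give 120 candidate fusions.
orders = fusion_orders([f"d{i}" for i in range(1, 8)])
print(sum(len(subsets) for subsets in orders.values()))  # 120
```

Under these assumptions, each fused vector would be passed to a classifier such as a tree-based model. The count of 120 shows why an exhaustive evaluation across fusion orders remains tractable for seven descriptors.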

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1207/9329987/6562ae66490c/ijms-23-08221-g001.jpg
