Zhang Cece, Zhu Xuehuan, Peterson Nick, Wang Jieqiong, Wan Shibiao
Department of Cell & Systems Biology, University of Toronto, ON, Canada.
School of Engineering, University of California, Los Angeles, CA, United States.
ArXiv. 2025 Apr 24:arXiv:2504.17162v1.
The subcellular localization of RNAs, including long non-coding RNAs (lncRNAs), messenger RNAs (mRNAs), microRNAs (miRNAs) and other smaller RNAs, plays a critical role in determining their biological functions. For instance, lncRNAs are predominantly associated with chromatin and act as regulators of gene transcription and chromatin structure, while mRNAs are distributed across the nucleus and cytoplasm, facilitating the transport of genetic information for protein synthesis. Understanding RNA localization sheds light on processes like gene expression regulation with spatial and temporal precision. However, traditional wet lab methods for determining RNA localization, such as in situ hybridization, are often time-consuming, resource-demanding, and costly. To overcome these challenges, computational methods leveraging artificial intelligence (AI) and machine learning (ML) have emerged as powerful alternatives, enabling large-scale prediction of RNA subcellular localization. This paper provides a comprehensive review of the latest advancements in AI-based approaches for RNA subcellular localization prediction, covering various RNA types and focusing on sequence-based, image-based, and hybrid methodologies that combine both data types. We highlight the potential of these methods to accelerate RNA research, uncover molecular pathways, and guide targeted disease treatments. Furthermore, we critically discuss the challenges in AI/ML approaches for RNA subcellular localization, such as data scarcity and lack of benchmarks, and opportunities to address them. This review aims to serve as a valuable resource for researchers seeking to develop innovative solutions in the field of RNA subcellular localization and beyond.