Institute of Artificial Intelligence Application, College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, Hunan 410004, China.
Guangzhou Xinhua University, 510520, Guangzhou, China.
Methods. 2024 Jul;227:17-26. doi: 10.1016/j.ymeth.2024.04.018. Epub 2024 May 3.
Messenger RNA (mRNA) is vital for post-transcriptional gene regulation and serves as the direct template for protein synthesis. However, existing methods for predicting mRNA subcellular localization leave room for improvement. Notably, few existing algorithms can annotate an mRNA sequence with multiple localizations. In this work, we propose mRNA-CLA, a multi-label subcellular localization prediction framework for mRNA based on deep learning with a multi-head self-attention mechanism. The framework employs a multi-scale convolutional layer to extract sequence features across different regions and uses a self-attention mechanism explicitly designed for each sequence. Paired with Position Weight Matrices (PWMs) derived from the convolutional neural network layers, the model offers interpretable analysis. In particular, we perform a base-level analysis of mRNA sequences from diverse subcellular localizations to determine the nucleotide specificity at each site. Our evaluations demonstrate that mRNA-CLA substantially outperforms existing methods and tools.
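The abstract does not say how the PWMs are derived from the convolutional layers. One common recipe, used here only as an assumption and not as the authors' procedure, is to slide a learned filter over each sequence, keep the highest-scoring window, and count nucleotides per position across those windows. The sketch below illustrates that recipe in plain Python. The names `best_window` and `filter_to_pwm`, the `ACGU` alphabet order, and the activation `threshold` are all hypothetical, and the result is strictly a per-position probability matrix rather than a log-odds PWM.

```python
ALPHABET = "ACGU"  # assumed column order for filter weights and PWM rows


def best_window(seq, kernel):
    """Return (score, start) of the window where the filter responds most.

    `kernel` is a list of k rows, each holding one weight per nucleotide in
    ALPHABET order, i.e. a single 1-D convolution filter over one-hot input.
    """
    k = len(kernel)
    best = None
    for start in range(len(seq) - k + 1):
        score = 0.0
        for i in range(k):
            j = ALPHABET.find(seq[start + i])
            if j >= 0:  # unknown bases (e.g. N) contribute nothing
                score += kernel[i][j]
        if best is None or score > best[0]:
            best = (score, start)
    return best


def filter_to_pwm(seqs, kernel, threshold=0.0, pseudocount=0.0):
    """Build a position probability matrix from a filter's top-scoring windows.

    Sequences whose best score does not exceed `threshold` are ignored, so the
    matrix reflects only windows that actually activate the filter.
    """
    k = len(kernel)
    counts = [[pseudocount] * len(ALPHABET) for _ in range(k)]
    for seq in seqs:
        if len(seq) < k:
            continue
        score, start = best_window(seq, kernel)
        if score <= threshold:
            continue
        for i, base in enumerate(seq[start:start + k]):
            j = ALPHABET.find(base)
            if j >= 0:
                counts[i][j] += 1
    pwm = []
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row when no window contributed counts.
        pwm.append([c / total if total else 1.0 / len(ALPHABET) for c in row])
    return pwm


if __name__ == "__main__":
    # Toy filter that prefers the triplet A-U-G (hypothetical weights).
    toy_kernel = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
    for row in filter_to_pwm(["CCAUGCC", "AUGAAA", "GGGAUG"], toy_kernel):
        print(row)
```

Each row of the returned matrix gives the nucleotide distribution at one filter position, which is the kind of per-site specificity the base-level analysis in the abstract refers to. In a trained model the filter weights would come from the network instead of being written by hand.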