IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):8241-8253. doi: 10.1109/TNNLS.2022.3226301. Epub 2024 Jun 3.
Polyadenylation [Poly(A)] is an essential step in messenger RNA (mRNA) maturation in eukaryotes. Identifying Poly(A) signals (PASs) at the genome level is key to understanding the mechanisms of translation regulation and mRNA metabolism. In this work, we propose a deep dual-dynamic context-aware Poly(A) signal prediction model, called multiscale convolution with self-attention networks (MCANet), to adaptively uncover spatial-temporal contextual dependencies. Specifically, the model automatically learns and strengthens informative features along both the temporal and the spatial dimensions. Identity connections pass contextual feature maps of the Poly(A) data directly from earlier layers to later layers. A fully parametric rectified linear unit (FP-RELU) with dual-dynamic coefficients is then devised to make the model easier to train and to improve its generalization ability. A cross-entropy loss (CL) function is designed to make the model focus on samples that are easy to misclassify. Experiments on different Poly(A) signals demonstrate the superior performance of the proposed MCANet, and an ablation study shows the effectiveness of the network design for feature learning and prediction of Poly(A) signals.
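The abstract does not give the formulas behind these components, so the following is only a minimal sketch of the general ideas and not the authors' implementation. It assumes a one-hot sequence encoding, hand-set convolution kernels of two widths, a hypothetical `two_slope_relu` standing in for FP-RELU (separate slopes for positive and negative inputs), and a focal-style weighting standing in for the loss that emphasizes easily misclassified samples. All function names and parameter values are illustrative, and the self-attention stage is omitted.

```python
import math

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T order)."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq.upper():
        row = [0.0] * 4
        if base in index:  # unknown bases such as N stay all-zero
            row[index[base]] = 1.0
        rows.append(row)
    return rows

def conv1d(x, kernel):
    """Valid 1-D cross-correlation of a (length x 4) input with a (width x 4) kernel."""
    width = len(kernel)
    return [
        sum(x[i + j][c] * kernel[j][c] for j in range(width) for c in range(4))
        for i in range(len(x) - width + 1)
    ]

def two_slope_relu(v, pos_slope, neg_slope):
    """Assumed stand-in for FP-RELU: separate (learnable) slopes for v > 0 and v <= 0."""
    return pos_slope * v if v > 0 else neg_slope * v

def multiscale_features(x, kernels, pos_slope=1.0, neg_slope=0.1):
    """Apply kernels of different widths, activate, and max-pool each response to one value."""
    return [
        max(two_slope_relu(v, pos_slope, neg_slope) for v in conv1d(x, k))
        for k in kernels
    ]

def focal_cross_entropy(p, label, gamma=2.0):
    """Cross-entropy scaled by (1 - p_true)^gamma (assumed focal-style weighting)."""
    p_true = p if label == 1 else 1.0 - p
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

# Usage: two kernel widths scanning a short sequence that contains AATAAA.
x = one_hot("TTAATAAAGC")
kernels = [one_hot("ATA"), one_hot("AATAAA")]  # hand-set here; learned in a real model
features = multiscale_features(x, kernels)

easy = focal_cross_entropy(0.9, 1)  # confident, correct prediction
hard = focal_cross_entropy(0.6, 1)  # uncertain prediction on the same label
```

In this sketch the uncertain prediction contributes far more to the loss than the confident one, which is the behavior the abstract describes; setting `gamma=0.0` recovers plain cross-entropy.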