Guo Yanbu, Li Chaoyang, Zhou Dongming, Cao Jinde, Liang Hui
College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China.
School of Information Science and Engineering, Yunnan University, Kunming 650500, China.
Neural Netw. 2022 Aug;152:287-299. doi: 10.1016/j.neunet.2022.04.025. Epub 2022 Apr 29.
Accurately predicting polyadenylation (Poly(A)) signals is the key to understanding the mechanism of translation regulation and mRNA metabolism. However, existing computational algorithms fail to predict Poly(A) signals well, because simply increasing the number of layers leads to the vanishing gradient problem. In this work, we devise a spatiotemporal context-aware neural model called ACNet for Poly(A) signal prediction based on co-occurrence embedding. Specifically, genomic sequences of Poly(A) signals are first split into k-mer sequences, and k-mer embeddings are pre-trained based on the co-occurrence matrix information. Then, gated residual networks are devised to fully extract spatial information; these networks have an excellent ability to control the information flow and ease the problem of vanishing gradients. The gated mechanism generates channel weights by a dilated convolution and aggregates local features by identity connections, which are obtained by multi-scale dilated convolutions. Experimental results indicate that our ACNet model outperforms the state-of-the-art prediction methods on various Poly(A) signal data, and an ablation study shows the effectiveness of the design strategy.
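The first preprocessing step described above (splitting sequences into k-mers and collecting co-occurrence information) can be illustrated with a minimal sketch. The abstract does not specify the k-mer length, stride, or context window, so `k`, `stride`, and `window` below are assumptions, as are the function names and the toy sequences; this is not the authors' implementation. The resulting counts are the kind of input on which a count-based embedding could then be pre-trained.

```python
from collections import Counter


def kmer_tokens(seq, k=3, stride=1):
    """Split a nucleotide sequence into k-mers (k and stride are assumed values)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def cooccurrence_counts(token_lists, window=2):
    """Count symmetric co-occurrences of k-mers within `window` positions.

    `window` is a hypothetical context size; the abstract does not give one.
    """
    counts = Counter()
    for tokens in token_lists:
        for i, center in enumerate(tokens):
            # Look only to the right and record both directions,
            # which makes the resulting matrix symmetric.
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                counts[(center, tokens[j])] += 1
                counts[(tokens[j], center)] += 1
    return counts


# Toy sequences for illustration only (not data from the paper).
sequences = ["AATAAAGCT", "TTAATAAAC"]
tokenized = [kmer_tokens(s, k=3) for s in sequences]
cooc = cooccurrence_counts(tokenized, window=2)
```

A sparse `Counter` keyed by k-mer pairs is used here because the full matrix has 4^k rows and columns, most of which stay empty for short windows.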