College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China.
Anal Biochem. 2022 Oct 1;654:114802. doi: 10.1016/j.ab.2022.114802. Epub 2022 Jul 7.
RNA solvent accessibility has recently attracted attention because of growing awareness of its importance for key biological processes. Accurately predicting the solvent accessibility of RNA is crucial for understanding its 3D structure and biological function. In this study, we develop a novel computational method, termed Mpred, for accurately predicting the solvent accessibility of RNA from sequence-based multi-scale context features. In Mpred, three single-view features, i.e., base-pairing probabilities, a position-specific frequency matrix, and a binary one-hot encoding, are first generated as three feature sources and then concatenated into a super feature. Second, the matrix-format feature of each nucleotide is extracted from the super feature using an initialized sliding window technique, and these matrices are stacked into a cube-format feature. Third, using a multi-scale context feature extraction strategy, a pyramid feature composed of contextual features at four scales around each target nucleotide is extracted from the cube-format feature. Finally, a customized multi-shot neural network framework, equipped with receptive fields at four different scales and built mainly from several residual attention blocks, is designed to mine discriminative information from the contextual pyramid feature. Experimental results demonstrate that the proposed Mpred achieves high prediction performance and outperforms existing state-of-the-art methods for predicting RNA solvent accessibility.
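The feature-construction steps described above (per-nucleotide encoding, concatenation, sliding-window stacking into a cube, and extraction of a four-scale pyramid) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the zero-padding at sequence ends, the window size of 7, the scales (1, 3, 5, 7), and the centre-cropping used to build the pyramid are all assumptions made for illustration. Only the one-hot view is computed here; base-pairing probabilities and the position-specific frequency matrix would come from external tools and are represented by any extra per-nucleotide matrices passed to `concat_features`.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]

ALPHABET = "ACGU"  # column order of the one-hot view (an assumption)


def one_hot(seq: str) -> Matrix:
    """Binary one-hot encoding: one row per nucleotide, one column per base.

    Characters outside ALPHABET (e.g. 'N') become all-zero rows.
    """
    return [[1.0 if base == ref else 0.0 for ref in ALPHABET]
            for base in seq.upper()]


def concat_features(*views: Matrix) -> Matrix:
    """Concatenate per-nucleotide feature views column-wise ("super feature")."""
    length = len(views[0])
    if any(len(view) != length for view in views):
        raise ValueError("all views must cover the same number of nucleotides")
    return [[x for view in views for x in view[i]] for i in range(length)]


def sliding_windows(features: Matrix, window: int) -> List[Matrix]:
    """Return one (window x dim) matrix per nucleotide.

    Stacking these matrices gives the (length x window x dim) cube.
    Sequence ends are zero-padded (an assumption, not stated in the abstract).
    """
    if not features:
        raise ValueError("features must contain at least one nucleotide")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    dim = len(features[0])
    pad = [[0.0] * dim for _ in range(half)]
    padded = pad + features + pad
    return [padded[i:i + window] for i in range(len(features))]


def pyramid(cube: List[Matrix], scales: Sequence[int]) -> Dict[int, List[Matrix]]:
    """Centre-crop every nucleotide's window to each requested scale.

    Centre-cropping is a hypothetical stand-in for the multi-scale
    context extraction strategy named in the abstract.
    """
    window = len(cube[0])
    centre = window // 2
    levels: Dict[int, List[Matrix]] = {}
    for scale in scales:
        if scale < 1 or scale % 2 == 0 or scale > window:
            raise ValueError("each scale must be odd and no larger than the window")
        half = scale // 2
        levels[scale] = [m[centre - half:centre + half + 1] for m in cube]
    return levels


# Example: a toy 5-nucleotide sequence with the one-hot view only.
toy_cube = sliding_windows(one_hot("GGCAU"), window=7)
toy_levels = pyramid(toy_cube, scales=(1, 3, 5, 7))
```

In this sketch each pyramid level keeps the target nucleotide at the centre row, so a downstream network could process the four levels in parallel branches with different receptive fields before merging them.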