Northeast Normal University, School of Computer Science and Information Technology, Changchun, 130117, China.
Northeast Normal University, School of Environment, Changchun, 130117, China.
Sci Rep. 2017 Mar 20;7:44836. doi: 10.1038/srep44836.
Small interfering RNAs (siRNAs) can induce targeted gene knockdown, and the effectiveness of gene silencing depends on the efficacy of the siRNA. The task of this paper is therefore to construct an effective method for predicting siRNA efficacy. In our work, we describe siRNA from both quantitative and qualitative aspects. For the quantitative analysis, we form four groups of features (nucleotide frequencies, thermodynamic stability profile, thermodynamics of the siRNA-mRNA interaction, and mRNA-related features) as a new mixed representation; to the best of our knowledge, this is the first time the thermodynamics of the siRNA-mRNA interaction has been introduced to siRNA efficacy prediction. An F-score-based feature selection is then employed to investigate the contribution of each feature and to remove weakly relevant features. Meanwhile, we encode the siRNA sequence and existing empirical design rules as a qualitative siRNA representation. The two kinds of siRNA representation are combined at the score level using support vector regression (SVR) to predict siRNA efficacy. The experimental results indicate that our method can select features with strong discriminative ability and make full use of both siRNA representations. The prediction results also demonstrate that our method can outperform other popular siRNA efficacy prediction algorithms.
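The quantitative part of the pipeline can be illustrated with a minimal sketch. It computes one of the four feature groups (nucleotide frequencies) and ranks features by F-score. The abstract does not give the exact F-score formula, so the code assumes the two-class definition commonly used for SVM feature selection, with siRNAs pre-labelled as effective or ineffective; that labelling, the function names, and the toy inputs are all hypothetical and not taken from the paper.

```python
from statistics import mean, variance


def nucleotide_frequencies(sirna):
    """Return the fractions of A, C, G and U in an siRNA sequence."""
    seq = sirna.upper().replace("T", "U")  # accept DNA-style input
    return [seq.count(base) / len(seq) for base in "ACGU"]


def f_score(pos, neg):
    """Two-class F-score of a single feature (assumed definition).

    pos / neg hold the feature values of the effective / ineffective
    siRNAs; each list needs at least two values.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1)
    if within == 0:
        return 0.0 if between == 0 else float("inf")
    return between / within


def rank_features(rows, labels):
    """Return (feature_index, F-score) pairs, highest score first.

    rows   : one feature vector per siRNA
    labels : True for effective siRNAs, False otherwise (hypothetical split)
    """
    scores = []
    for j in range(len(rows[0])):
        pos = [row[j] for row, lab in zip(rows, labels) if lab]
        neg = [row[j] for row, lab in zip(rows, labels) if not lab]
        scores.append((j, f_score(pos, neg)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy sequences and labels, for illustration only.
    sirnas = ["UUAGCAUGCAUCGAUCGAU", "AUAUUAGCAUUAAUCGUAA",
              "GGCGCCGGAUCGGCCGAUC", "GCGGCCAUGCGCGGAUCCG"]
    labels = [True, True, False, False]
    features = [nucleotide_frequencies(s) for s in sirnas]
    for index, score in rank_features(features, labels):
        print("ACGU"[index], score)
```

Features with low scores would be the candidates for removal before the selected features are passed to the regression step.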