Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA.
BMC Genomics. 2010 Dec 1;11 Suppl 3(Suppl 3):S2. doi: 10.1186/1471-2164-11-S3-S2.
Short interfering RNAs (siRNAs) can be used to knock down gene expression in functional genomics. Many siRNA molecules can be designed for a given target gene, but their efficiency in inhibiting expression often varies.
To facilitate gene functional studies, we have developed a new machine learning method to predict siRNA potency based on random forests and support vector machines. Because there were many potential sequence features, random forests were used to select the features most relevant to gene expression inhibition. Support vector machine classifiers were then constructed from the selected sequence features to predict siRNA potency. Interestingly, gene expression inhibition is significantly affected by the nucleotide dimer and trimer compositions of the siRNA sequence.
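As an illustration of the kind of sequence features mentioned above, the following minimal sketch computes nucleotide monomer, dimer, and trimer compositions for a single siRNA sequence. It is not the authors' implementation: the function names, the normalization by the number of overlapping windows, and the example sequence are all assumptions made for illustration. A feature vector of this form could then be passed to a random forest for feature ranking and to a support vector machine for classification.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides


def kmer_composition(seq, k):
    """Return the relative frequency of every possible k-mer in seq.

    Overlapping windows are counted, and each count is divided by the
    number of windows (an assumed normalization, not taken from the paper).
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    n_windows = max(len(seq) - k + 1, 1)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {kmer: counts[kmer] / n_windows for kmer in all_kmers}


def sequence_features(seq):
    """Combine monomer, dimer, and trimer compositions into one feature dict."""
    features = {}
    for k in (1, 2, 3):
        features.update(kmer_composition(seq, k))
    return features


# Hypothetical 21-nt sequence, used only to show the shape of the output.
example = sequence_features("AUGCAUGCAUGCAUGCAUGCA")
print(len(example))   # number of features: 4 + 16 + 64
print(example["AU"])  # relative frequency of the dimer AU
```

Each sequence yields 84 features under these assumptions (4 monomers, 16 dimers, 64 trimers), which is the sort of feature space where a selection step before classification becomes useful.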
The findings in this study should help design potent siRNAs for functional genomics, and might also provide further insights into the molecular mechanism of RNA interference.