Saetrom Pål
Interagon AS, Medisinsk teknisk senter, N-7489 Trondheim, Norway.
Bioinformatics. 2004 Nov 22;20(17):3055-63. doi: 10.1093/bioinformatics/bth364. Epub 2004 Jun 16.
Both small interfering RNAs (siRNAs) and antisense oligonucleotides can selectively block gene expression. Although the two methods rely on different cellular mechanisms, they share the property that not all oligonucleotides (oligos) are equally effective: if mRNA target sites are picked at random, many of the resulting antisense or siRNA oligos will be ineffective. Algorithms that reliably predict the efficacy of candidate oligos can greatly reduce the cost of knockdown experiments, but previous attempts to predict the efficacy of antisense oligos have had limited success. Machine learning has not previously been used to predict siRNA efficacy.
We develop a genetic-programming-based prediction system that shows promising results on both antisense and siRNA efficacy prediction. We train and evaluate the system on a previously published database of antisense efficacies and on our own database of siRNA efficacies collected from the literature. The best models gave an overall correlation between predicted and observed efficacy of 0.46 on both antisense and siRNA data. For comparison, the best correlations of support vector machine classifiers trained on the same data were 0.40 and 0.30, respectively.
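The headline figures above are correlations between predicted and observed efficacy. As a minimal sketch of how such a figure can be computed, the snippet below assumes the Pearson correlation coefficient is meant (the abstract does not name the measure). The function name `pearson` and the efficacy values are hypothetical and serve only to illustrate the arithmetic; they are not data from the paper.

```python
import math


def pearson(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    # Sums of squared deviations (variances up to the same constant factor).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_o)


# Made-up illustrative efficacies (fraction of knockdown), NOT from the paper.
predicted = [0.2, 0.5, 0.7, 0.4, 0.9]
observed = [0.1, 0.6, 0.5, 0.3, 0.8]

r = pearson(predicted, observed)
print(f"r = {r:.2f}")
```

A value near 1 would mean the predictions rank and scale the oligos almost exactly as observed, while a value near 0 would mean the predictions carry little linear information about the measured efficacy.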