Pancoska Petr, Moravek Zdenek, Moll Ute M
Department of Pathology, SUNY, Stony Brook, NY 11794, USA.
Nucleic Acids Res. 2004 Mar 1;32(4):1469-79. doi: 10.1093/nar/gkh314. Print 2004.
Several factors influence the efficiency of gene silencing by small interfering RNA (siRNA) duplexes. They fall into two categories: cell-specific factors and molecular factors of RNA interference (RNAi). Sequence-based siRNA design presupposes that hybridization thermodynamics is the dominant factor. We assume that cell-specific parameters (cell line, degradation, cross-hybridization, target conformation, etc.) can be pooled into an average cellular factor. We hypothesize that the molecular basis of the positional dependence of siRNA-induced gene silencing is the uniqueness of the context of the corresponding target sequence segment relative to all other such segments along the targeted RNA. We encode this context in descriptors derived from an Eulerian graph representation of siRNAs and show that a descriptor based on contextual similarity and predicted thermodynamic stability correlates with the experimentally observed silencing efficiency for the human lamin A/C gene. We further show that the information encoded in this regression function generalizes and can be used to predict siRNA efficiency in unrelated genes (CD54 and PTEN). In summary, our method advances siRNA design beyond the currently used algorithms, which are only qualitative in nature.
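The abstract does not spell out how the graph descriptors are built, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that a sequence segment is drawn as a directed multigraph on the four nucleotides, with one edge per dinucleotide step, so that reading the segment traverses every edge exactly once (an Eulerian trail). The function names, the 19-nt default window, and the L1 distance between edge counts used as a "context uniqueness" score are all hypothetical choices made for illustration; the thermodynamic stability term and the regression step mentioned in the abstract are not modeled here.

```python
from collections import Counter

NUCLEOTIDES = "ACGU"


def edge_counts(segment):
    """Count directed edges (dinucleotide steps) of the multigraph traced by a segment."""
    return Counter(zip(segment, segment[1:]))


def degree_imbalance(segment):
    """Return out-degree minus in-degree per nucleotide.

    For an Eulerian trail, every node is balanced (0) except the start (+1)
    and the end (-1); if the segment starts and ends on the same nucleotide,
    all nodes are balanced.
    """
    balance = {n: 0 for n in NUCLEOTIDES}
    for (src, dst), k in edge_counts(segment).items():
        balance[src] += k
        balance[dst] -= k
    return balance


def graph_distance(a, b):
    """Hypothetical dissimilarity: L1 distance between the edge-count tables of two segments."""
    ca, cb = edge_counts(a), edge_counts(b)
    return sum(abs(ca[e] - cb[e]) for e in set(ca) | set(cb))


def context_uniqueness(target, start, length=19):
    """Mean graph distance from one window to every other window of the same length.

    `length=19` is an assumed window size, not a value taken from the abstract.
    """
    segment = target[start:start + length]
    others = [
        target[i:i + length]
        for i in range(len(target) - length + 1)
        if i != start
    ]
    if not others:
        raise ValueError("target must contain at least two windows")
    return sum(graph_distance(segment, o) for o in others) / len(others)
```

Under these assumptions, a window whose dinucleotide graph differs strongly from all other windows along the same RNA receives a high score. A score of this kind could then be combined with a predicted duplex stability value and fitted against measured silencing efficiencies, which is the role the abstract assigns to its regression function.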