Department of Bioinformatics, Tongji University, Shanghai, China.
PLoS One. 2012;7(5):e37879. doi: 10.1371/journal.pone.0037879. Epub 2012 May 24.
RNA interference via exogenous short interfering RNAs (siRNAs) is increasingly employed as a tool in gene function studies, drug target discovery and disease treatment. There is currently a strong need for rational siRNA design to achieve more reliable and specific gene silencing and to keep pace with a widening range of applications. While progress has been made in designing siRNAs against specific targets, the rational design of siRNAs with high efficacy is still at an early stage. Among the many obstacles to overcome, two stand out: a lack of general understanding of which sequence features of siRNAs affect their silencing efficacy, and a lack of the large-scale homogeneous data needed to carry out such association analyses. To address these issues, we investigated feature-selection-based in silico siRNA design from a novel cross-platform data integration perspective. An integration analysis of 4,482 siRNAs from ten meta-datasets was conducted to rank siRNA features according to their possible importance to silencing efficacy across heterogeneous data sources. Our ranking analysis revealed, for the first time, the most relevant features based on cross-platform experiments, and it compares favorably with traditional in silico siRNA feature screening based on small samples of individual platform data. We believe that our feature ranking analysis can offer more credible suggestions to help improve the design of siRNAs with specific silencing targets. Data and scripts are available at http://csbl.bmb.uga.edu/publications/materials/qiliu/siRNA.html.
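The abstract does not spell out the ranking procedure, so the following is only an illustrative sketch of one generic way to rank features across heterogeneous datasets. It is not the authors' method. It assumes each dataset reports efficacy on its own scale, scores every feature within a dataset by the absolute Spearman correlation with efficacy, and then averages the within-dataset ranks. The feature names (`gc_content`, `pos1_is_A`, `noise`), the toy values, and all function names are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


def rank_features(datasets, feature_names):
    """Order features by their mean within-dataset rank (best first).

    datasets: list of (rows, efficacy) pairs, where rows is a list of
    feature vectors and efficacy is a list of measured silencing values.
    """
    collected = {name: [] for name in feature_names}
    for rows, efficacy in datasets:
        scores = [
            abs(spearman([row[j] for row in rows], efficacy))
            for j in range(len(feature_names))
        ]
        # Rank 1 = strongest association with efficacy in this dataset.
        within = ranks([-s for s in scores])
        for name, r in zip(feature_names, within):
            collected[name].append(r)
    return sorted(feature_names, key=lambda n: mean(collected[n]))


# Hypothetical toy data: two "platforms" with different efficacy scales.
features = ["gc_content", "pos1_is_A", "noise"]
platform_a = (
    [[0.30, 1, 0.5], [0.40, 1, 0.1], [0.50, 0, 0.9], [0.60, 0, 0.3], [0.70, 0, 0.7]],
    [0.9, 0.8, 0.5, 0.3, 0.1],  # fraction of target knocked down
)
platform_b = (
    [[0.35, 1, 0.2], [0.45, 0, 0.8], [0.55, 1, 0.4], [0.65, 0, 0.6]],
    [85, 70, 60, 20],  # percent knockdown
)

ranking = rank_features([platform_a, platform_b], features)
print(ranking)
```

Working on ranks within each dataset avoids comparing raw efficacy values across platforms, which is one simple way to cope with heterogeneous measurement scales. A real analysis would also need to handle dataset size, redundancy between features, and statistical significance.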