Studio of Computational Biology & Bioinformatics, Biotechnology Division, Institute of Himalayan Bioresource Technology, Council of Scientific & Industrial Research, Palampur 176061 (HP), India.
BMC Genomics. 2011 Dec 29;12:636. doi: 10.1186/1471-2164-12-636.
miRNAs are small noncoding RNA molecules, about 21 nucleotides long, formed endogenously in most eukaryotes. They control their target genes mainly post-transcriptionally, by interacting with and silencing them. While many tools have been developed for animal miRNA target identification, plant miRNA target identification has seen limited development. Most existing plant tools are centered on exact complementarity matching, and very few consider other factors such as multiple target sites and the role of flanking regions.
In the present work, a Support Vector Regression (SVR) approach has been implemented for plant miRNA target identification. It uses position-specific dinucleotide density variation around the target sites to yield highly reliable results, and has been named p-TAREF (plant-Target Refiner). p-TAREF was compared rigorously with other prediction tools for plants and was found to perform better in several aspects. Further, p-TAREF was run on experimentally validated miRNA targets from species such as Arabidopsis, Medicago, Rice and Tomato, and detected them accurately, suggesting broad usability of p-TAREF across plant species. Using p-TAREF, target identification was done for the complete Rice transcriptome, supported by expression and degradome data. miR156 was found to be an important component of the Rice regulatory system, predominantly controlling genes associated with growth and transcription. The entire methodology has been implemented in a multi-threaded parallel architecture in Java, to enable fast processing in both the web-server and standalone versions. This also allows it to run in concurrent mode even on a simple desktop computer. In its web-server version, it additionally provides a facility to gather experimental support for the predictions through on-the-spot expression data analysis.
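To make the idea of position-specific dinucleotide density concrete, the following is a minimal sketch and not the p-TAREF implementation. It assumes that the target site plus its flanking regions are cut into fixed, non-overlapping windows and that the fraction of each of the 16 dinucleotides is recorded per window. The function names, the flank length and the window size are all hypothetical choices made for illustration.

```python
from itertools import product

# The 16 possible RNA dinucleotides, in a fixed order (AA, AC, AG, AU, CA, ...).
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]


def dinucleotide_density(window):
    """Return the fraction of each of the 16 dinucleotides in one window."""
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(window) - 1):
        pair = window[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]


def site_features(transcript, site_start, site_end, flank=10, window=5):
    """Build a position-specific feature vector for one candidate site.

    flank and window are illustrative values, not those used by p-TAREF.
    """
    seq = transcript.upper().replace("T", "U")
    start = max(0, site_start - flank)
    end = min(len(seq), site_end + flank)
    region = seq[start:end]
    features = []
    # One block of 16 densities per non-overlapping window, in positional order.
    for offset in range(0, len(region) - window + 1, window):
        features.extend(dinucleotide_density(region[offset:offset + window]))
    return features
```

A vector built this way keeps the positional order of the windows, so a regression model such as an SVR could in principle learn how dinucleotide composition changes across the site and its flanks.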
A multivariate, machine-learning-based tool for plant miRNA target identification has been implemented in a parallel and locally installable form. Its performance was assessed and compared through comprehensive testing and benchmarking, suggesting reliable performance and broad usability for transcriptome-wide plant miRNA target identification.
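The parallel, transcriptome-wide scanning described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's Java code: `best_site` is a hypothetical stand-in scorer based only on complementarity, and `scan_transcripts` and its `workers` parameter are likewise invented names. The point is only that transcripts can be scored independently of one another, which is what makes the task easy to distribute across threads.

```python
from concurrent.futures import ThreadPoolExecutor

# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def best_site(mirna, transcript):
    """Return (start, matched_fraction) for the most complementary window.

    A hypothetical placeholder scorer; returns (-1, 0.0) if no window fits.
    """
    target = mirna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]
    seq = transcript.upper().replace("T", "U")
    best = (-1, 0.0)
    for start in range(len(seq) - len(target) + 1):
        window = seq[start:start + len(target)]
        matches = sum(a == b for a, b in zip(window, target))
        fraction = matches / len(target)
        if fraction > best[1]:
            best = (start, fraction)
    return best


def scan_transcripts(mirna, transcripts, workers=4):
    """Score every transcript independently, using a pool of worker threads."""
    ids = list(transcripts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda tid: best_site(mirna, transcripts[tid]), ids)
        return dict(zip(ids, results))
```

In CPython, threads do not speed up CPU-bound work of this kind because of the global interpreter lock, so a process pool would likely be the better choice in practice. The thread pool is used here only to mirror the multi-threaded design the abstract describes.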