Department of Systems and Computer Engineering, Carleton University, Ottawa, K1S 5B6, Canada.
Saint-Jean-sur-Richelieu Research and Development Center, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, J3B 7B5, Canada.
Sci Rep. 2023 Jan 6;13(1):332. doi: 10.1038/s41598-022-27283-8.
MicroRNAs (miRNAs) are small non-coding ribonucleic acids that post-transcriptionally regulate gene expression by targeting messenger RNAs (mRNAs). Most miRNA target predictors have focused on animal species, and prediction performance drops substantially when they are applied to plant species. Several rule-based miRNA target predictors have been developed for plant species, but they often fail to discover new miRNA targets with non-canonical miRNA-mRNA binding. Here, the recently published TarDB database of plant miRNA-mRNA data is leveraged to retrain the TarPmiR miRNA target predictor for application to plant species. Rigorous experimental design across four plant test species demonstrates that animal-trained predictors fail to sustain performance on plant species, and that the use of plant-specific training data improves accuracy depending on the quantity of plant training data used. Surprisingly, our results indicate that the complete exclusion of animal training data leads to the most accurate plant-specific miRNA target predictor, indicating that animal-based data may detract from miRNA target prediction in plants. Our final plant-specific miRNA prediction method, dubbed P-TarPmiR, is freely available for use at http://ptarpmir.cu-bic.ca. The final P-TarPmiR method is used to predict targets for all miRNAs within the soybean genome. Those ranked predictions, together with GO term enrichment, are shared with the research community.