Ataei Sobhan, Ahmadi Jafar, Marashi Sayed-Amir, Abolhasani Ilia
Department of Genetics and Plant Breeding, Imam Khomeini International University, Qazvin, Iran.
Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran.
PLoS One. 2024 Aug 1;19(8):e0308016. doi: 10.1371/journal.pone.0308016. eCollection 2024.
MicroRNAs (miRNAs) are small noncoding RNAs that play important post-transcriptional regulatory roles in animals and plants. Despite the importance of plant miRNAs, the inherent complexity of miRNA biogenesis in plants hampers the application of standard miRNA prediction tools, which are often optimized for animal sequences. Computational approaches that predict putative miRNAs from genomic sequence alone, regardless of expression level or tissue specificity, are therefore of great interest.
Here, we present AmiR-P3, a novel ab initio plant miRNA prediction pipeline that leverages the strengths of various utilities for its key computational steps. Users can readily adjust the prediction criteria based on current biological knowledge of plant miRNA properties. The pipeline starts by finding potential homologs of known plant miRNAs in the input sequence(s) and ensuring that they do not overlap with protein-coding regions. Next, the secondary structure of each presumed RNA sequence is computed by minimum free energy folding, and a deep learning classification model is employed to predict potential pre-miRNA structures. Finally, a set of criteria is used to select the most likely miRNAs from the predicted candidates. We show that our method yields acceptable predictions in a variety of plant species.
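The final, criteria-based selection step can be illustrated with a minimal sketch. It is not AmiR-P3's implementation: the `Candidate` type, the helper names, and both thresholds (`max_amfe`, `min_paired`) are hypothetical, chosen only to show how user-adjustable criteria might be applied to a candidate precursor once a folding tool has produced its dot-bracket structure and minimum free energy.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    sequence: str   # RNA sequence of the putative precursor
    structure: str  # dot-bracket structure from an MFE folding tool
    mfe: float      # minimum free energy in kcal/mol (more negative = more stable)


def paired_fraction(structure: str) -> float:
    """Fraction of positions that are base-paired in a dot-bracket string."""
    if not structure:
        return 0.0
    paired = sum(1 for ch in structure if ch in "()")
    return paired / len(structure)


def is_single_hairpin(structure: str) -> bool:
    """True if every '(' precedes every ')', i.e. one stem without branching."""
    last_open = structure.rfind("(")
    first_close = structure.find(")")
    return last_open != -1 and first_close != -1 and last_open < first_close


def adjusted_mfe(candidate: Candidate) -> float:
    """MFE scaled to 100 nt so that candidates of different lengths compare."""
    return 100.0 * candidate.mfe / len(candidate.sequence)


def passes_criteria(
    candidate: Candidate,
    max_amfe: float = -25.0,  # hypothetical threshold, kcal/mol per 100 nt
    min_paired: float = 0.5,  # hypothetical threshold, fraction of paired bases
) -> bool:
    """Apply the illustrative filters; all thresholds are caller-adjustable."""
    return (
        len(candidate.sequence) > 0
        and len(candidate.sequence) == len(candidate.structure)
        and is_single_hairpin(candidate.structure)
        and paired_fraction(candidate.structure) >= min_paired
        and adjusted_mfe(candidate) <= max_amfe
    )


if __name__ == "__main__":
    # Toy inputs; the structures and energies are made up for illustration.
    candidates = [
        Candidate("hairpin", "GCGCGCGCAAAAGCGCGCGC", "((((((((....))))))))", -8.0),
        Candidate("branched", "GCGCAAAAGCGCGCGCAAAAGCGC", "((((....))))((((....))))", -12.0),
    ]
    for c in candidates:
        print(c.name, passes_criteria(c))
```

Keeping the thresholds as keyword arguments mirrors the idea that prediction criteria are meant to be tuned by the user rather than fixed in the code.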
AmiR-P3 does not necessarily require sequencing reads or an assembled reference genome, which enables it to identify conserved and novel putative miRNAs from any genomic or transcriptomic sequence. AmiR-P3 is therefore suitable for miRNA prediction even in less-studied plants, as it does not require any prior knowledge of the miRNA repertoire of the organism. AmiR-P3 is provided as a Docker container, a portable and self-contained software package that can be readily installed and run on any platform. It is freely available for non-commercial use from: https://hub.docker.com/r/micrornaproject/amir-p3.