Persson Hodén Kristian, Hu Xinyi, Martinez German, Dixelius Christina
The Swedish University of Agricultural Sciences, Department of Plant Biology, Uppsala BioCenter, Linnean Center for Plant Biology, P.O. Box 7080, S-75007 Uppsala, Sweden.
Int J Mol Sci. 2021 Apr 20;22(8):4267. doi: 10.3390/ijms22084267.
Degradome sequencing is commonly used to generate high-throughput information on mRNA cleavage sites mediated by small RNAs (sRNAs). In our datasets of potato (Solanum tuberosum, St) and Phytophthora infestans (Pi), initial analyses generated high numbers of cleavage site predictions, which highlighted the need for improved analytic tools. Here, we present smartPARE, an R package based on a deep learning convolutional neural network (CNN) in a machine learning environment, to optimize discrimination of false from true cleavage sites. When applying smartPARE to our datasets on potato during infection by the late blight pathogen, 7.3% of all cleavage windows represented true cleavages, distributed over 214 sites in P. infestans and 444 sites in potato. The sRNA landscape of the two organisms is complex, with uneven sRNA production and cleavage regions widespread in the two genomes. Multiple targets and several cases of complex regulatory cascades, particularly in potato, were revealed. We conclude that our new analytic approach is useful for anyone working on complex biological systems who is interested in identifying cleavage sites, particularly those inferred by sRNA classes beyond miRNAs.