ICAR-Indian Agricultural Statistics Research Institute, Indian Council of Agricultural Research, New Delhi 110012, India.
ICAR-National Institute for Plant Biotechnology, Indian Council of Agricultural Research, New Delhi 110012, India.
Int J Mol Sci. 2022 Jan 30;23(3):1612. doi: 10.3390/ijms23031612.
MicroRNAs (miRNAs) play a significant role in plant responses to abiotic stresses. Identifying abiotic stress-responsive miRNAs is therefore important for crop breeding programmes that aim to develop cultivars resistant to such stresses. In this study, we developed a machine learning-based computational method for predicting miRNAs associated with abiotic stresses. Three types of datasets were used for prediction: miRNA, Pre-miRNA, and Pre-miRNA + miRNA. Pseudo K-tuple nucleotide compositional features were generated for each sequence to transform the sequence data into numeric feature vectors, and a support vector machine (SVM) was employed for prediction. The area under the receiver operating characteristic curve (auROC) was 70.21%, 69.71%, and 77.94%, and the area under the precision-recall curve (auPRC) was 69.96%, 65.64%, and 77.32%, for the miRNA, Pre-miRNA, and Pre-miRNA + miRNA datasets, respectively. Overall prediction accuracies on the independent test set were 62.33%, 64.85%, and 69.21%, respectively, for the three datasets. The SVM also achieved higher accuracy than other learning methods, namely random forest, extreme gradient boosting, and adaptive boosting. To make the method easy to use, an online prediction server, "ASRmiRNA", has been developed. The proposed approach is expected to supplement existing efforts to identify abiotic stress-responsive miRNAs and Pre-miRNAs.
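The step of turning a sequence into a numeric feature vector can be illustrated with a minimal sketch. It computes only plain K-tuple (k-mer) frequencies; pseudo K-tuple nucleotide composition as usually defined also adds sequence-order correlation terms derived from physicochemical properties, which are omitted here. The function name `ktuple_composition`, the choice of k = 2, and the example sequence are assumptions for illustration and are not taken from the study.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style input is mapped T -> U below


def ktuple_composition(seq, k=2):
    """Return normalized K-tuple frequencies for one sequence.

    Hypothetical helper: the output has 4**k entries, ordered
    lexicographically over ALPHABET (AA, AC, AG, AU, CA, ...).
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k across the sequence.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Arbitrary example sequence, used only to show the shape of the output.
features = ktuple_composition("UGGAAGACUAGUGAUUUUGUUGU", k=2)
```

A vector of this kind, one per sequence, is the sort of input a classifier such as an SVM would be trained on; the feature set and model settings used in the study itself may differ.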