pmiRScan: a LightGBM based method for prediction of animal pre-miRNAs.

Author information

Venkatesan Amrit, Basak Jolly, Bahadur Ranjit Prasad

Affiliations

Computational Structural Biology Lab, Department of Bioscience and Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, 721302, India.

Genomics of Plant Stress Biology Lab, Department of Biotechnology, Visva-Bharati, Santiniketan, West Bengal, 731235, India.

Publication information

Funct Integr Genomics. 2025 Jan 9;25(1):9. doi: 10.1007/s10142-025-01527-y.

Abstract

MicroRNAs (miRNAs) are short endogenous non-coding RNAs that play a significant role in post-transcriptional gene regulation. Identifying new animal precursor miRNAs (pre-miRNAs) and miRNAs is crucial for understanding the role of miRNAs in various biological processes, including the development of diseases. The present study develops a Light Gradient Boosting (LGB) based method for classifying animal pre-miRNAs using various sequence and secondary-structure features. Distinct k-mer repeat signatures with a length of three nucleotides are identified in various pre-miRNA families. Of the nine classifiers trained and tested in the present study, LGB performs best overall, with an AUROC of 0.959. Compared with existing methods, our method 'pmiRScan' has better overall performance, with an accuracy of 0.93, sensitivity of 0.86, specificity of 0.95 and F-score of 0.82. Moreover, pmiRScan effectively classifies pre-miRNAs from four distinct taxonomic groups: mammals, nematodes, molluscs and arthropods. We used our classifier to predict pre-miRNAs genome-wide in human and found a total of 313 pre-miRNA candidates. A total of 180 potential mature miRNAs belonging to 60 distinct miRNA families were extracted from the predicted pre-miRNAs, of which 128 are novel and not reported in miRBase. These discoveries may enhance our current understanding of miRNAs and their targets in human. pmiRScan is freely available at http://www.csb.iitkgp.ac.in/applications/pmiRScan/index.php.
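To make two ideas from the abstract concrete, the sketch below shows one plausible way to compute 3-mer frequency features from an RNA sequence and to derive accuracy, sensitivity, specificity and F-score from a confusion matrix. It is an illustration only, not the pmiRScan implementation: the function names `kmer_frequencies` and `binary_metrics`, the toy sequence, and the toy counts are hypothetical, and the F-score is assumed here to be the F1 score (the abstract does not say which variant was used).

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return the normalized frequency of every RNA k-mer in seq.

    Hypothetical helper: the abstract mentions 3-nucleotide k-mer
    signatures but does not specify how features were encoded.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    # Count overlapping windows of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalize over valid k-mers only (windows with N etc. are ignored).
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from raw counts.

    The F-score is assumed to be F1 (harmonic mean of precision and
    sensitivity); this is an assumption, not stated in the abstract.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy inputs for illustration only; not data from the study.
    features = kmer_frequencies("AUGAUG")
    print(features["AUG"])
    print(binary_metrics(tp=8, fp=2, tn=18, fn=2))
```

A feature vector of this kind (64 values for k = 3) could be combined with secondary-structure descriptors and passed to any gradient-boosting classifier. The metrics function illustrates why accuracy and F-score can diverge on imbalanced data: accuracy counts true negatives, while the F-score does not.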
