Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm.

Author information

Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106, Taiwan.

Publication information

BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S52. doi: 10.1186/1471-2105-11-S1-S52.

Abstract

BACKGROUND

MicroRNAs (miRNAs) are short non-coding RNA molecules that play an important role in post-transcriptional regulation of gene expression. Many efforts have been made over the years to discover miRNA precursors (pre-miRNAs). Recently, ab initio approaches have attracted more attention because they do not depend on homology information and are more broadly applicable than comparative approaches. Kernel-based classifiers such as the support vector machine (SVM) are extensively adopted in these ab initio approaches because of the prediction performance they achieve. Logic-based classifiers such as decision trees, whose constructed models are interpretable, have attracted less attention.

RESULTS

This article reports the design of a pre-miRNA predictor built on a novel kernel-based classifier, the generalized Gaussian density estimator (G2DE) based classifier. The G2DE is a kernel-based algorithm designed to provide interpretability by constructing the classification model from a few representative kernels. The performance of the proposed predictor was evaluated with 692 human pre-miRNAs and compared with that of two kernel-based and two logic-based classifiers. The experimental results show that the proposed predictor achieves prediction performance comparable to that of the prevailing kernel-based classification algorithms, while also giving the user an overall picture of the distribution of the data set.
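The idea of classifying with a few interpretable density components can be illustrated with a minimal sketch. It is not the G2DE learning procedure, which the abstract does not describe. It assumes that a "generalized" Gaussian component means a Gaussian with a full covariance matrix, fits a single component per class from the sample mean and covariance (in effect, Gaussian discriminant analysis), and uses two made-up features with toy values. All function names, feature values, and labels are hypothetical.

```python
import math

def fit_component(points):
    """Fit one full-covariance 2-D Gaussian (mean, covariance) to points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # A small ridge keeps the covariance invertible for tiny samples.
    return (mx, my), (sxx + 1e-6, sxy, syy + 1e-6)

def component_density(x, mean, cov):
    """Evaluate a 2-D Gaussian with covariance [[sxx, sxy], [sxy, syy]] at x."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    m = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * m) / (2 * math.pi * math.sqrt(det))

def class_density(x, components):
    """Weighted sum of components: the class-conditional density estimate."""
    return sum(w * component_density(x, mean, cov) for w, mean, cov in components)

def train(samples):
    """samples maps label -> list of (f1, f2); one component per class here."""
    total = sum(len(pts) for pts in samples.values())
    model = {}
    for label, pts in samples.items():
        mean, cov = fit_component(pts)
        model[label] = (len(pts) / total, [(1.0, mean, cov)])
    return model

def predict(x, model):
    """Pick the class with the largest prior * class-conditional density."""
    return max(model, key=lambda lab: model[lab][0] * class_density(x, model[lab][1]))

# Toy, made-up feature vectors (not data from the study).
TOY_SAMPLES = {
    "pre-miRNA": [(0.9, 0.8), (1.0, 0.7), (0.8, 0.9), (1.1, 0.85)],
    "background": [(0.2, 0.3), (0.3, 0.2), (0.1, 0.4), (0.25, 0.35)],
}

model = train(TOY_SAMPLES)
for label, (prior, components) in model.items():
    # Each component is directly readable: a centre and a covariance shape.
    print(label, prior, components)
print(predict((0.95, 0.8), model))
```

Because each class is summarized by a handful of means and covariances rather than by many support vectors, the fitted parameters can be inspected directly, which is the kind of interpretability the abstract attributes to using few kernels.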

CONCLUSION

In recent years, biologists have used software predictors that identify pre-miRNAs in genomic sequences to facilitate molecular biology research. The G2DE employed in this study delivers prediction accuracy comparable to that of state-of-the-art kernel-based machine learning algorithms. Furthermore, the models generated by the G2DE-based predictor give biologists insight into the different characteristics of pre-miRNA sequences.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3fdf/3009525/3a679b252d48/1471-2105-11-S1-S52-1.jpg
