De novo SVM classification of precursor microRNAs from genomic pseudo hairpins using global and intrinsic folding measures.

Authors

Ng Kwang Loong Stanley, Mishra Santosh K

Affiliation

Bioinformatics Institute, Singapore.

Publication information

Bioinformatics. 2007 Jun 1;23(11):1321-30. doi: 10.1093/bioinformatics/btm026. Epub 2007 Jan 31.

DOI: 10.1093/bioinformatics/btm026
PMID: 17267435
Abstract

MOTIVATION

MicroRNAs (miRNAs) are small ncRNAs participating in diverse cellular and physiological processes through the post-transcriptional gene regulatory pathway. Critically associated with miRNA biogenesis, the hairpin structure is a necessary feature for the computational classification of novel precursor miRNAs (pre-miRs). Though many of the abundant genomic inverted repeats (pseudo hairpins) can be filtered computationally, novel species-specific pre-miRs are likely to remain elusive.

RESULTS

miPred is a de novo Support Vector Machine (SVM) classifier for identifying pre-miRs without relying on phylogenetic conservation. To achieve significantly higher sensitivity and specificity than existing (quasi) de novo predictors, it employs a Gaussian Radial Basis Function (RBF) kernel as a similarity measure for 29 global and intrinsic hairpin folding attributes. These attributes characterize a pre-miR at the dinucleotide sequence, hairpin folding, non-linear statistical thermodynamics and topological levels. Trained on 200 human pre-miRs and 400 pseudo hairpins, miPred achieves a 5-fold cross-validation accuracy of 93.50% and a ROC score of 0.9833. Tested on the remaining 123 human pre-miRs and 246 pseudo hairpins, it reports 84.55% sensitivity, 97.97% specificity and 93.50% accuracy. Validated on 1918 pre-miRs across 40 non-human species and 3836 pseudo hairpins, it yields 87.65% (92.08%), 97.75% (97.42%) and 94.38% (95.64%) for the mean (overall) sensitivity, specificity and accuracy, respectively. Notably, A.mellifera, A.geoffroyi, C.familiaris, E.Barr, H.simplex virus, H.cytomegalovirus, O.aries, P.patens, R.lymphocryptovirus, Simian virus and Z.mays are unambiguously classified with 100.00% sensitivity and >93.75% specificity.
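
As a rough illustration of two quantities named in the results, the Python sketch below shows a Gaussian RBF kernel and the arithmetic behind the reported test-set metrics. It is not the authors' code. The function names and the `gamma` parameter are hypothetical, and the abstract gives no kernel parameters. The confusion-matrix counts (104 true positives, 241 true negatives) are not reported in the abstract either; they are inferred here as the only integers consistent with 84.55% sensitivity on 123 pre-miRs and 97.97% specificity on 246 pseudo hairpins.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF similarity exp(-gamma * ||x - y||^2) between two
    feature vectors. gamma is a free parameter (hypothetical here)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)   # fraction of real pre-miRs recovered
    specificity = tn / (tn + fp)   # fraction of pseudo hairpins rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts for the human test set (an assumption, see lead-in):
# 123 pre-miRs = 104 TP + 19 FN; 246 pseudo hairpins = 241 TN + 5 FP.
sens, spec, acc = classification_metrics(tp=104, fn=19, tn=241, fp=5)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
# prints: sensitivity=84.55% specificity=97.97% accuracy=93.50%
```

With these counts the accuracy is (104 + 241) / 369, which rounds to the 93.50% quoted for the test set. The kernel returns 1.0 for identical feature vectors and decays toward 0 as they move apart.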

AVAILABILITY

Data sets, raw statistical results and source code are available at http://web.bii.a-star.edu.sg/~stanley/Publications
