OmniGA: Optimized Omnivariate Decision Trees for Generalizable Classification Models.

Affiliation

King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center, Thuwal, 23955-6900, Saudi Arabia.

Publication information

Sci Rep. 2017 Jun 20;7(1):3898. doi: 10.1038/s41598-017-04281-9.

DOI:10.1038/s41598-017-04281-9
PMID:28634344
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5478657/
Abstract

Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.
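
To make the reported "F score error" reduction concrete, the following is a minimal sketch of one plausible reading of that metric. It assumes the error is defined as 1 - F and that the reduction is measured relative to the best competing model; the abstract does not spell out the formula, so both the definition and the helper names `f_score` and `relative_error_reduction` are illustrative assumptions, not taken from the OmniGA code.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """F1 score from confusion-matrix counts (harmonic mean of precision and recall)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def relative_error_reduction(f_baseline: float, f_new: float) -> float:
    """Percentage by which the F-score error (assumed to be 1 - F) shrinks
    when moving from the baseline model to the new model."""
    err_base = 1.0 - f_baseline
    err_new = 1.0 - f_new
    if err_base == 0.0:
        raise ValueError("baseline already has zero error; reduction is undefined")
    return 100.0 * (err_base - err_new) / err_base


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the two ends of the reported range.
    print(relative_error_reduction(0.90, 1.00))    # baseline error fully removed
    print(relative_error_reduction(0.80, 0.8045))  # small improvement over a baseline of 0.80
```

Under this reading, a 100% reduction means the new model reaches F = 1 on a dataset where the best competitor did not, while a 2.25% reduction corresponds to a small gain such as F rising from 0.80 to about 0.8045.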

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6c3/5478657/57fe91bbf9b7/41598_2017_4281_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6c3/5478657/00681ba8edae/41598_2017_4281_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6c3/5478657/1413d9b44964/41598_2017_4281_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6c3/5478657/370bdb04ec90/41598_2017_4281_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6c3/5478657/1d78e073ce6e/41598_2017_4281_Fig5_HTML.jpg
