A machine learning method based on the genetic and world competitive contests algorithms for selecting genes or features in biological applications.

Author affiliations

Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran.

Department of Bioinformatics, Biotechnology Research Center, Tabriz Branch, Islamic Azad University, Tabriz, Iran.

Publication information

Sci Rep. 2021 Feb 8;11(1):3349. doi: 10.1038/s41598-021-82796-y.

DOI:10.1038/s41598-021-82796-y
PMID:33558580
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7870651/
Abstract

Gene/feature selection is an essential preprocessing step for building models with machine learning techniques. It also plays a critical role in biological applications such as biomarker identification. Although many feature/gene selection algorithms and methods have been introduced, they may suffer from problems such as difficult parameter tuning or poor performance. To tackle these limitations, this study introduces a universal wrapper approach based on our previously introduced optimization algorithm (the world competitive contests algorithm) and the genetic algorithm (GA). In the proposed approach, candidate solutions have variable lengths, and a support vector machine scores them. To show the usefulness of the method, thirteen classification and regression datasets with different properties were chosen from various biological domains, including drug discovery, cancer diagnostics, and clinical applications. Our findings confirmed that the proposed method outperforms most of the other currently used approaches and can also free users from the difficulties of tuning various parameters. As a result, users may optimize their biological applications, for example by obtaining a biomarker diagnostic kit with the minimum number of genes and maximum separability power.
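
To make the wrapper idea in the abstract concrete, the following is a minimal Python sketch of a genetic algorithm that evolves variable-length feature subsets and scores each one with a classifier. It is not the authors' implementation: the world competitive contests operators are not reproduced, and a leave-one-out nearest-centroid classifier stands in for the support vector machine so that the example needs only the standard library. All function names, the synthetic data, the size penalty of 0.01 per feature, the mutation rate, and the population settings are assumptions chosen for illustration.

```python
import random


def make_data(n=60, n_features=12, informative=(0, 3), seed=0):
    """Synthetic two-class data; only the `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier on `subset`.

    Stand-in scorer (assumption): the abstract scores candidates with an SVM.
    Requires at least two samples per class.
    """
    classes = sorted(set(y))
    sums = {c: [0.0] * len(subset) for c in classes}
    counts = {c: 0 for c in classes}
    for row, label in zip(X, y):
        counts[label] += 1
        for p, j in enumerate(subset):
            sums[label][p] += row[j]
    correct = 0
    for row, label in zip(X, y):
        best, best_dist = None, float("inf")
        for c in classes:
            held_out = 1 if c == label else 0  # exclude the test sample itself
            dist = 0.0
            for p, j in enumerate(subset):
                mean = (sums[c][p] - held_out * row[j]) / (counts[c] - held_out)
                dist += (row[j] - mean) ** 2
            if dist < best_dist:
                best, best_dist = c, dist
        correct += best == label
    return correct / len(X)


def fitness(X, y, subset, penalty=0.01):
    """Accuracy minus a per-feature penalty (the weight is an assumption)."""
    return loo_accuracy(X, y, subset) - penalty * len(subset)


def crossover(a, b, rng):
    """Child draws a random number of features from the parents' union."""
    pool = sorted(set(a) | set(b))
    return sorted(rng.sample(pool, rng.randint(1, len(pool))))


def mutate(subset, n_features, rng):
    """Add, drop, or swap one feature; the last feature is never dropped."""
    chosen = set(subset)
    unused = [j for j in range(n_features) if j not in chosen]
    op = rng.choice(("add", "drop", "swap"))
    if op in ("drop", "swap") and len(chosen) > 1:
        chosen.remove(rng.choice(sorted(chosen)))
    if op in ("add", "swap") and unused:
        chosen.add(rng.choice(unused))
    return sorted(chosen)


def select_features(X, y, pop_size=20, generations=25, seed=1):
    """Evolve variable-length feature subsets and return the fittest one."""
    rng = random.Random(seed)
    n_features = len(X[0])
    pop = [sorted(rng.sample(range(n_features), rng.randint(1, n_features)))
           for _ in range(pop_size)]

    def score(subset):
        return fitness(X, y, subset)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = crossover(a, b, rng)
            if rng.random() < 0.3:  # mutation rate is an assumption
                child = mutate(child, n_features, rng)
            children.append(child)
        pop = elite + children
    return max(pop, key=score)


if __name__ == "__main__":
    X, y = make_data()
    best = select_features(X, y)
    print("selected features:", best)
    print("leave-one-out accuracy:", loo_accuracy(X, y, best))
```

The per-feature penalty in `fitness` reflects the stated goal of a minimum number of genes with maximum separability. To follow the abstract more closely, `loo_accuracy` could be replaced by a cross-validated SVM score, for instance with scikit-learn's `SVC` and `cross_val_score`, at the cost of an external dependency.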

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/936cf1ff97b2/41598_2021_82796_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/36164e5a4b2c/41598_2021_82796_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/967523169e8f/41598_2021_82796_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/a11b64c06183/41598_2021_82796_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/8c0efe4ded51/41598_2021_82796_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/fc06a583bb19/41598_2021_82796_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/ddff33dfb4f7/41598_2021_82796_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/2d6e72c9aa9b/41598_2021_82796_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/67cd22a02773/41598_2021_82796_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/56249c33932a/41598_2021_82796_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/115d0ecb0923/41598_2021_82796_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/baea9aab7d0e/41598_2021_82796_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/d62160074cf7/41598_2021_82796_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/4bf343050efb/41598_2021_82796_Fig14_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/b70932457d83/41598_2021_82796_Fig15_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/a00605b707cf/41598_2021_82796_Fig16_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/65b7ce77a785/41598_2021_82796_Fig17_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cd/7870651/69b6fd05c293/41598_2021_82796_Fig18_HTML.jpg
