Letter to the editor: on the stability and ranking of predictors from random forest variable importance measures.

Publication information

Brief Bioinform. 2011 Jul;12(4):369-73. doi: 10.1093/bib/bbr016. Epub 2011 Apr 15.

DOI: 10.1093/bib/bbr016
PMID: 21498552
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3137934/
Abstract

A recent study examined the stability of rankings from random forests using two variable importance measures, mean decrease accuracy (MDA) and mean decrease Gini (MDG), and concluded that rankings based on the MDG were more robust than those based on the MDA. However, few studies have examined how data-specific characteristics affect ranking stability. Rankings based on the MDG measure were sensitive to within-predictor correlation and to differences in category frequencies, even when the number of categories was held constant, and thus may produce spurious results. The MDA measure was robust to these data characteristics. Further, under strong within-predictor correlation, MDG rankings were less stable than MDA rankings.
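
To make the two importance measures concrete, the following minimal Python sketch computes a Gini-based and a permutation-based score for a single binary predictor. It is an illustration only, not the authors' method: it uses one split and the fixed rule "predict y = x" rather than a random forest, and the function names (gini, gini_decrease, accuracy_decrease) are hypothetical. In an actual forest, MDG aggregates impurity decreases over all nodes and trees, and MDA is computed on out-of-bag samples.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def gini_decrease(x, y):
    """Impurity decrease from one split on a binary predictor x (MDG-style)."""
    left = [yi for xi, yi in zip(x, y) if xi == 0]
    right = [yi for xi, yi in zip(x, y) if xi == 1]
    n = len(y)
    return gini(y) - len(left) / n * gini(left) - len(right) / n * gini(right)


def accuracy_decrease(x, y, rng, n_repeats=20):
    """Mean drop in accuracy of the rule 'predict y = x' after permuting x (MDA-style)."""
    base = sum(xi == yi for xi, yi in zip(x, y)) / len(y)
    drops = []
    for _ in range(n_repeats):
        shuffled = x[:]          # copy so the caller's data is left untouched
        rng.shuffle(shuffled)    # break the association between x and y
        permuted = sum(xi == yi for xi, yi in zip(shuffled, y)) / len(y)
        drops.append(base - permuted)
    return sum(drops) / n_repeats


if __name__ == "__main__":
    rng = random.Random(0)
    y = [0, 0, 1, 1] * 25
    informative = y[:]            # predictor identical to the outcome
    noise = [0, 1, 0, 1] * 25     # predictor unrelated to the outcome
    for name, x in (("informative", informative), ("noise", noise)):
        print(name, gini_decrease(x, y), accuracy_decrease(x, y, rng))
```

The Gini score depends only on how the split partitions the training labels, whereas the permutation score measures how much predictive accuracy is lost when the predictor's link to the outcome is broken. That difference is one reason the two measures can respond differently to data characteristics such as category frequencies.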
