Determining the number of clusters using the weighted gap statistic.

Authors

Yan Mingjin, Ye Keying

Affiliation

Medtronic Sofamor Danek, 1800 Pyramid Place, Memphis, Tennessee 38132, USA.

Publication

Biometrics. 2007 Dec;63(4):1031-7. doi: 10.1111/j.1541-0420.2007.00784.x. Epub 2007 Apr 9.

DOI: 10.1111/j.1541-0420.2007.00784.x
PMID: 17425640

Abstract

Estimating the number of clusters in a data set is a crucial step in cluster analysis. In this article, motivated by the gap method (Tibshirani, Walther, and Hastie, 2001, Journal of the Royal Statistical Society B 63, 411-423), we propose the weighted gap and the difference of difference-weighted (DD-weighted) gap methods for estimating the number of clusters in data using the weighted within-clusters sum of errors: a measure of the within-clusters homogeneity. In addition, we propose a "multilayer" clustering approach, which is shown to be more accurate than the original gap method, particularly in detecting the nested cluster structure of the data. The methods are applicable when the input data contain continuous measurements and can be used with any clustering method. Simulation studies and real data are investigated and compared among these proposed methods as well as with the original gap method.
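
As a rough illustration of the gap criterion the abstract builds on, the sketch below implements the original gap statistic of Tibshirani, Walther, and Hastie with a plain k-means and a uniform reference distribution over the bounding box of the data. The `weighted=True` option divides each cluster's sum of squares by n_r - 1. That weighting is an assumption made for illustration, since the abstract does not state the formula, and the same goes for every name and default here (`estimate_k`, `k_max`, `n_refs`, `seed`). The DD-weighted and multilayer procedures are not sketched.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(members):
    n = len(members)
    return tuple(sum(col) / n for col in zip(*members))


def dispersion(points, labels, k, weighted=False):
    """Within-cluster dispersion W_k.

    For squared Euclidean distance, the pairwise sum D_r equals
    2 * n_r * SS_r, so D_r / (2 n_r) is the sum of squares SS_r.
    weighted=True uses SS_r / (n_r - 1) instead (assumed weighting).
    """
    total = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        n = len(members)
        if n < 2:
            continue  # empty or singleton clusters contribute nothing
        c = centroid(members)
        ss = sum(sq_dist(p, c) for p in members)
        total += ss / (n - 1) if weighted else ss
    return total


def kmeans(points, k, rng, restarts=5, iters=30):
    """Lloyd's k-means; returns the labels of the best of several restarts."""
    best_labels, best_ss = None, float("inf")
    for _ in range(restarts):
        centers = rng.sample(points, k)
        labels = None
        for _ in range(iters):
            new_labels = [
                min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
            ]
            if new_labels == labels:
                break  # assignments stopped changing
            labels = new_labels
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                if members:  # keep the old center if a cluster empties
                    centers[j] = centroid(members)
        ss = dispersion(points, labels, k)
        if ss < best_ss:
            best_labels, best_ss = labels, ss
    return best_labels


def estimate_k(points, k_max=4, n_refs=10, weighted=False, seed=0):
    """Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}, else k_max."""
    rng = random.Random(seed)
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    # Reference data sets: uniform over the bounding box of the data.
    refs = [
        [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs)) for _ in points]
        for _ in range(n_refs)
    ]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(dispersion(points, kmeans(points, k, rng), k, weighted))
        ref_logs = [
            math.log(dispersion(r, kmeans(r, k, rng), k, weighted)) for r in refs
        ]
        mean = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k
    return k_max


if __name__ == "__main__":
    demo_rng = random.Random(42)
    demo = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(20)]
    demo += [(demo_rng.gauss(100, 1), demo_rng.gauss(0, 1)) for _ in range(20)]
    print("plain gap estimate:", estimate_k(demo))
    print("weighted gap estimate:", estimate_k(demo, weighted=True))
```

The clustering step is kept separate from the dispersion measure, which mirrors the abstract's remark that the methods can be used with any clustering method: swapping `kmeans` for another routine that returns labels leaves `estimate_k` unchanged.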

Similar articles

1
Determining the number of clusters using the weighted gap statistic.
Biometrics. 2007 Dec;63(4):1031-7. doi: 10.1111/j.1541-0420.2007.00784.x. Epub 2007 Apr 9.
2
Weighted rank regression for clustered data analysis.
Biometrics. 2008 Mar;64(1):39-45. doi: 10.1111/j.1541-0420.2007.00842.x. Epub 2007 Jun 30.
3
Cluster pattern detection in spatial data based on Monte Carlo inference.
Biom J. 2007 Aug;49(4):505-19. doi: 10.1002/bimj.200610326.
4
Cumulative voting consensus method for partitions with variable number of clusters.
IEEE Trans Pattern Anal Mach Intell. 2008 Jan;30(1):160-73. doi: 10.1109/TPAMI.2007.1138.
5
Detecting the number of clusters in n-way probabilistic clustering.
IEEE Trans Pattern Anal Mach Intell. 2010 Nov;32(11):2006-21. doi: 10.1109/TPAMI.2010.15.
6
Modified fuzzy gap statistic for estimating preferable number of clusters in fuzzy k-means clustering.
J Biosci Bioeng. 2008 Mar;105(3):273-81. doi: 10.1263/jbb.105.273.
7
Weighted rank aggregation of cluster validation measures: a Monte Carlo cross-entropy approach.
Bioinformatics. 2007 Jul 1;23(13):1607-15. doi: 10.1093/bioinformatics/btm158. Epub 2007 May 5.
8
High-dimensional unsupervised selection and estimation of a finite generalized Dirichlet mixture model based on minimum message length.
IEEE Trans Pattern Anal Mach Intell. 2007 Oct;29(10):1716-31. doi: 10.1109/TPAMI.2007.1095.
9
Graph-based semisupervised learning.
IEEE Trans Pattern Anal Mach Intell. 2008 Jan;30(1):174-9. doi: 10.1109/TPAMI.2007.70765.
10
Estimating the number of clusters via system evolution for cluster analysis of gene expression data.
IEEE Trans Inf Technol Biomed. 2009 Sep;13(5):848-53. doi: 10.1109/TITB.2009.2025119. Epub 2009 Jun 12.

Cited by

1
Carnivore space use behaviors reveal variation in responses to human land modification.
Mov Ecol. 2024 Jul 18;12(1):51. doi: 10.1186/s40462-024-00493-7.
2
RNAlysis: analyze your RNA sequencing data without writing a single line of code.
BMC Biol. 2023 Apr 7;21(1):74. doi: 10.1186/s12915-023-01574-6.
3
Immune environment and antigen specificity of the T cell receptor repertoire of malignant ascites in ovarian cancer.
PLoS One. 2023 Jan 6;18(1):e0279590. doi: 10.1371/journal.pone.0279590. eCollection 2023.
4
Noncanonical β-catenin interactions promote leukemia-initiating activity in early T-cell acute lymphoblastic leukemia.
Blood. 2023 Mar 30;141(13):1597-1609. doi: 10.1182/blood.2022017079.
5
Behavioural phenotypes of intrinsic motivation in schizophrenia determined by cluster analysis of objectively quantified real-world performance.
Schizophrenia (Heidelb). 2022 Oct 21;8(1):85. doi: 10.1038/s41537-022-00294-0.
6
A machine learning-based SNP-set analysis approach for identifying disease-associated susceptibility loci.
Sci Rep. 2022 Sep 22;12(1):15817. doi: 10.1038/s41598-022-19708-1.
7
Clustering analysis revealed the autophagy classification and potential autophagy regulators' sensitivity of pancreatic cancer based on multi-omics data.
Cancer Med. 2023 Jan;12(1):733-746. doi: 10.1002/cam4.4932. Epub 2022 Jun 9.
8
Paradoxical sex-specific patterns of autoantibody response to SARS-CoV-2 infection.
J Transl Med. 2021 Dec 30;19(1):524. doi: 10.1186/s12967-021-03184-8.
9
Replication and cross-validation of type 2 diabetes subtypes based on clinical variables: an IMI-RHAPSODY study.
Diabetologia. 2021 Sep;64(9):1982-1989. doi: 10.1007/s00125-021-05490-8. Epub 2021 Jun 10.
10
Corticosteroid Therapy Is Associated With Improved Outcome in Critically Ill Patients With COVID-19 With Hyperinflammatory Phenotype.
Chest. 2021 May;159(5):1793-1802. doi: 10.1016/j.chest.2020.11.050. Epub 2020 Dec 13.