A framework for multiple imputation in cluster analysis.

Affiliation

Centre for Research in Environmental Epidemiology, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain.

Publication information

Am J Epidemiol. 2013 Apr 1;177(7):718-25. doi: 10.1093/aje/kws289. Epub 2013 Feb 27.

DOI: 10.1093/aje/kws289
PMID: 23445902

Abstract

Multiple imputation is a common technique for dealing with missing values and is mostly applied in regression settings. Its application in cluster analysis problems, where the main objective is to classify individuals into homogenous groups, involves several difficulties which are not well characterized in the current literature. In this paper, we propose a framework for applying multiple imputation to cluster analysis when the original data contain missing values. The proposed framework incorporates the selection of the final number of clusters and a variable reduction procedure, which may be needed in data sets where the ratio of the number of persons to the number of variables is small. We suggest some ways to report how the uncertainty due to multiple imputation of missing data affects the cluster analysis outcomes, namely the final number of clusters, the results of a variable selection procedure (if applied), and the assignment of individuals to clusters. The proposed framework is illustrated with data from the Phenotype and Course of Chronic Obstructive Pulmonary Disease (PAC-COPD) Study (Spain, 2004-2008), which aimed to classify patients with chronic obstructive pulmonary disease into different disease subtypes.
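The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It shows one way the uncertainty in cluster assignment could be summarized across imputed data sets. Everything in it is an assumption made for illustration: the toy data, the crude random-draw imputation in `impute_once`, the plain k-means in `kmeans`, the fixed number of clusters `K`, and the `coassignment` and `ambiguity` summaries. Selecting the number of clusters and reducing variables, which the framework also covers, are left out.

```python
import random

def impute_once(rows, rng):
    """Fill each None with a randomly drawn observed value from the same column.
    This is a deliberately crude stand-in for a real imputation model."""
    n_cols = len(rows[0])
    observed = [[r[j] for r in rows if r[j] is not None] for j in range(n_cols)]
    return [[rng.choice(observed[j]) if v is None else v for j, v in enumerate(r)]
            for r in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def coassignment(label_sets):
    """Share of imputations in which each pair of individuals shares a cluster.
    Working with pairs avoids having to match cluster labels across runs."""
    n, m = len(label_sets[0]), len(label_sets)
    return [[sum(ls[i] == ls[j] for ls in label_sets) / m for j in range(n)]
            for i in range(n)]

rng = random.Random(0)
# Hypothetical toy data: two groups of 10 individuals, 2 variables, 2 missing values.
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(10)]
rows += [[rng.gauss(8, 1), rng.gauss(8, 1)] for _ in range(10)]
rows[3][1] = None
rows[15][0] = None

M, K = 20, 2  # number of imputations and a fixed number of clusters (both assumed)
label_sets = [kmeans(impute_once(rows, rng), K, rng) for _ in range(M)]
co = coassignment(label_sets)

# Per-individual ambiguity: 0 when an individual is always with, or always apart
# from, every other individual; larger when that varies across imputed data sets
# (or across the random k-means starts).
ambiguity = [sum(min(c, 1 - c) for c in row) / len(row) for row in co]
most_ambiguous = max(range(len(rows)), key=lambda i: ambiguity[i])
print("most ambiguous individual:", most_ambiguous, "score:", round(ambiguity[most_ambiguous], 3))
```

In practice the imputation step would come from a proper multiple-imputation model and the clustering from whichever method the analysis calls for. The pairwise summary is used here only because it is unaffected by how clusters happen to be numbered in each run.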

