Joint clustering with correlated variables.

Author information

Zhang Hongmei, Zou Yubo, Terry Will, Karmaus Wilfried, Arshad Hasan

Affiliations

School of Public Health, The University of Memphis, Memphis, TN.

Blue Cross Blue Shield of South Carolina, Columbia, SC.

Publication information

Am Stat. 2019;73(3):296-306. doi: 10.1080/00031305.2018.1424033. Epub 2018 Jul 9.

Abstract

Traditional clustering methods focus on grouping subjects or (dependent) variables, assuming independence between the variables. Clusters formed through these approaches can lack homogeneity. This article proposes a joint clustering method by which both variables and subjects are clustered. In each joint cluster (in general composed of a subset of variables and a subset of subjects), there exists a unique association between dependent variables and covariates of interest. To this end, a Bayesian method is designed, in which a semi-parametric model is used to evaluate any unknown relationships between possibly correlated variables and covariates of interest, and a Dirichlet process is utilized to cluster subjects. Compared to existing clustering techniques, the major novelty of the method lies in its ability to improve the homogeneity of clusters, along with its ability to take the correlations between variables into account. Via simulations, we examine the performance and efficiency of the proposed method. Applying the method to cluster allergens and subjects based on the association of wheal size in reaction to allergens with age, we found that a certain pattern of allergic sensitization to a set of allergens has the potential to reduce the occurrence of asthma.
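
The abstract states that a Dirichlet process is used to cluster subjects. As a rough illustration only, the partition prior implied by a Dirichlet process can be sketched with the Chinese restaurant process: each subject joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. This is not the authors' model, which also includes a semi-parametric regression component and the clustering of variables. The function name `crp_assignments` and the parameter values are assumptions chosen for the example.

```python
import random
from collections import Counter


def crp_assignments(n_subjects, alpha, seed=0):
    """Draw cluster labels for n_subjects from a Chinese restaurant process.

    alpha is the (hypothetical) concentration parameter; it must be > 0.
    Larger values tend to produce more clusters.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of subjects currently in cluster k
    for _ in range(n_subjects):
        # Existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        labels.append(k)
    return labels


if __name__ == "__main__":
    labels = crp_assignments(n_subjects=20, alpha=1.0, seed=1)
    print(labels)
    print(Counter(labels))  # cluster sizes
```

In a full Bayesian analysis these prior weights would be combined with a likelihood for each subject's data, so the sketch shows only the prior on partitions, not posterior inference.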

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a12f/7453389/6d9c8af804b9/nihms-1507196-f0001.jpg
