Integrating sample similarities into latent class analysis: a tree-structured shrinkage approach.

Author affiliations

Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.

Environmental and Occupational Health, Milken Institute School of Public Health, The George Washington University, Washington, District of Columbia, USA.

Publication information

Biometrics. 2023 Mar;79(1):264-279. doi: 10.1111/biom.13580. Epub 2021 Nov 10.

Abstract

This paper uses multivariate binary observations to estimate the probabilities of scientifically meaningful unobserved classes. We focus on the setting where additional information about sample similarities is available and is represented by a rooted weighted tree. Every leaf in the given tree contains multiple samples. A shorter distance over the tree between two leaves indicates a priori higher similarity in their class probability vectors. We propose a novel data-integrative extension of classical latent class models that uses tree-structured shrinkage. The proposed approach enables (1) borrowing of information across leaves, (2) estimating data-driven leaf groups with distinct vectors of class probabilities, and (3) individual-level probabilistic class assignment given the observed multivariate binary measurements. We derive and implement a scalable posterior inference algorithm in a variational Bayes framework. Extensive simulations show more accurate estimation of class probabilities than alternatives that suboptimally use the additional sample similarity information. A zoonotic infectious disease application illustrates the proposed approach. The paper concludes with a brief discussion of model limitations and extensions.
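To make the individual-level probabilistic class assignment in item (3) concrete, the following is a minimal sketch of Bayes' rule in a classical latent class model, assuming the binary measurements are conditionally independent given the class. It is not the paper's tree-structured shrinkage or variational Bayes algorithm. The function name `class_posterior` and all parameter values are hypothetical, chosen only for illustration; in the setting described above, `class_probs` would stand in for the class probability vector of the leaf containing the sample.

```python
import math


def class_posterior(y, class_probs, response_probs):
    """Posterior class probabilities for one sample in a classical latent class model.

    y              : list of 0/1 measurements for the sample
    class_probs    : prior class probabilities (hypothetical values; sum to 1)
    response_probs : response_probs[k][j] = P(y_j = 1 | class k), each strictly in (0, 1)
    """
    log_weights = []
    for pi_k, theta_k in zip(class_probs, response_probs):
        # log of prior times likelihood, assuming conditional independence given the class
        log_w = math.log(pi_k)
        for y_j, theta_kj in zip(y, theta_k):
            log_w += math.log(theta_kj if y_j == 1 else 1.0 - theta_kj)
        log_weights.append(log_w)

    # Normalize in a numerically stable way (subtract the maximum before exponentiating)
    max_log = max(log_weights)
    weights = [math.exp(lw - max_log) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-class, three-measurement example
class_probs = [0.6, 0.4]
response_probs = [
    [0.9, 0.8, 0.1],  # class 0
    [0.2, 0.3, 0.7],  # class 1
]
posterior = class_posterior([1, 1, 0], class_probs, response_probs)
print(posterior)
```

With these illustrative values, the sample is assigned mostly to class 0, because its response pattern is far more likely under that class's response probabilities.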
