The impact of ordinal scales on Gaussian mixture recovery.

Affiliations

Psychological Methods Group, University of Amsterdam, Amsterdam, Netherlands.

Department of Methodology and Statistics, Tilburg University, Tilburg, Netherlands.

Publication information

Behav Res Methods. 2023 Jun;55(4):2143-2156. doi: 10.3758/s13428-022-01883-8. Epub 2022 Jul 13.

DOI:10.3758/s13428-022-01883-8
PMID:35831565
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10250525/
Abstract

Gaussian mixture models (GMMs) are a popular and versatile tool for exploring heterogeneity in multivariate continuous data. Arguably the most popular way to estimate GMMs is via the expectation-maximization (EM) algorithm combined with model selection using the Bayesian information criterion (BIC). If the GMM is correctly specified, this estimation procedure has been demonstrated to have high recovery performance. However, in many situations, the data are not continuous but ordinal, for example when assessing symptom severity in medical data or modeling the responses in a survey. For such situations, it is unknown how well the EM algorithm and the BIC perform in GMM recovery. In the present paper, we investigate this question by simulating data from various GMMs, thresholding them in ordinal categories and evaluating recovery performance. We show that the number of components can be estimated reliably if the number of ordinal categories and the number of variables is high enough. However, the estimates of the parameters of the component models are biased independent of sample size. Finally, we discuss alternative modeling approaches which might be adopted for the situations in which estimating a GMM is not acceptable.
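The pipeline the abstract describes (simulate from a GMM, threshold into ordinal categories, fit by EM, select the number of components with BIC) can be sketched in a few lines. This is a minimal illustration and not the authors' code: it is univariate, whereas the paper studies multivariate data, and the equally spaced thresholds, the variance floor, the EM initialisation, and all function names are assumptions made for the example. BIC is written here so that lower values are better.

```python
import math
import random
from bisect import bisect_right


def simulate_gmm(n, weights, means, sds, rng):
    """Draw n values from a univariate Gaussian mixture."""
    draws = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append(rng.gauss(means[k], sds[k]))
    return draws


def to_ordinal(values, n_categories, low, high):
    """Code values as ordered categories 1..n_categories.

    Assumption: n_categories - 1 equally spaced thresholds between low and high.
    """
    step = (high - low) / n_categories
    thresholds = [low + step * j for j in range(1, n_categories)]
    return [bisect_right(thresholds, v) + 1 for v in values]


def component_logs(x, weights, means, variances):
    """Log of (weight * Gaussian density) at x, one entry per component."""
    return [
        math.log(w) - 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for w, m, v in zip(weights, means, variances)
    ]


def log_likelihood(data, weights, means, variances):
    """Mixture log-likelihood, using log-sum-exp for numerical stability."""
    total = 0.0
    for x in data:
        logs = component_logs(x, weights, means, variances)
        top = max(logs)
        total += top + math.log(sum(math.exp(v - top) for v in logs))
    return total


def fit_gmm_em(data, k, n_iter=100, var_floor=1e-2):
    """Fit a k-component 1D GMM by EM; returns (weights, means, variances, loglik).

    The variance floor (an assumption of this sketch) keeps a component from
    collapsing onto a single ordinal category.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    grand_mean = sum(data) / n
    start_var = max(sum((x - grand_mean) ** 2 for x in data) / n, var_floor)
    weights = [1.0 / k] * k
    means = [lo + (j + 0.5) * (hi - lo) / k for j in range(k)]
    variances = [start_var] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation.
        resp = []
        for x in data:
            logs = component_logs(x, weights, means, variances)
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            norm = sum(probs)
            resp.append([p / norm for p in probs])
        # M-step: responsibility-weighted updates of weights, means, variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10  # guard against empty components
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, var_floor)
            weights[j] = nj / (n + k * 1e-10)
    return weights, means, variances, log_likelihood(data, weights, means, variances)


def bic(loglik, k, n):
    """BIC = p * ln(n) - 2 * loglik, with p = 3k - 1 free parameters for a 1D GMM."""
    return (3 * k - 1) * math.log(n) - 2.0 * loglik


if __name__ == "__main__":
    rng = random.Random(1)
    latent = simulate_gmm(500, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0], rng)
    ordinal = [float(c) for c in to_ordinal(latent, 5, -4.0, 4.0)]
    for label, data in (("continuous", latent), ("ordinal", ordinal)):
        scores = {k: bic(fit_gmm_em(data, k)[3], k, len(data)) for k in (1, 2, 3)}
        print(label, "BIC selects k =", min(scores, key=scores.get))
```

Varying `n_categories` and comparing the fitted means and variances on the ordinal data against the generating values gives a rough feel for the recovery questions the paper examines. The paper's own simulation design and results should be taken from the full text.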

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/108b2e924e1d/13428_2022_1883_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/31f1051b5433/13428_2022_1883_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/47bc479206d3/13428_2022_1883_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/048a41079e92/13428_2022_1883_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/ee7efae61371/13428_2022_1883_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/50e92a1da3ef/13428_2022_1883_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/c2effbeb99b0/13428_2022_1883_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/0053acc4a2e6/13428_2022_1883_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/90860d03c940/13428_2022_1883_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/cc3cba95032a/13428_2022_1883_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/6e06f691411d/13428_2022_1883_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/ccd069e428d3/13428_2022_1883_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/10250525/67e09100b2a7/13428_2022_1883_Fig13_HTML.jpg
