

Investigating the ecological fallacy through sampling distributions constructed from finite populations.

Author information

Torres David J, Rouson Damian

Affiliations

Department of Mathematics and Physical Science, Northern New Mexico College, Española, NM 87532, USA.

Computer Languages and Systems Software Group, Lawrence Berkeley National Laboratory, Berkeley, California, USA.

Publication information

Monte Carlo Methods Appl. 2024 Aug 8;30(4):331-363. doi: 10.1515/mcma-2024-2013. eCollection 2024 Dec.

Abstract

Correlation coefficients and linear regression values computed from group averages can differ from those computed using individual scores. This observation, known as the ecological fallacy, often assumes that all the individual scores in a population are available. In many situations, however, one must work with a sample from the larger population. In such cases, the computed correlation coefficient and linear regression values depend on the sample that is chosen and on the underlying sampling distribution. For normally distributed variables, when random samples are drawn from infinitely large continuous distributions, the sampling distribution of correlation coefficients and linear regression values for group averages is identical to the sampling distribution for individuals. In practice, however, data are often acquired by sampling without replacement from a finite population. Our objective is to demonstrate through Monte Carlo simulations that the sampling distributions for correlation and linear regression are also similar for individuals and group averages when sampling without replacement from normally distributed variables. These simulations suggest that when a random sample is selected from a population, correlation coefficients and linear regression values computed from individual scores are no more accurate in estimating the population values than those computed from group averages, as long as the sample size is the same.
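
The kind of Monte Carlo comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the population size, group size, sample size, target correlation `rho`, the random assignment of individuals to groups, and the function name `simulate` are all assumptions chosen for illustration. The sketch builds one finite bivariate normal population, forms group averages, and then repeatedly samples without replacement to compare the sampling distribution of the Pearson correlation for individuals and for group averages at the same sample size.

```python
import numpy as np

def simulate(pop_size=10_000, group_size=10, sample_size=50,
             trials=2_000, rho=0.6, seed=0):
    """Return sample correlations for individuals and for group averages.

    All parameter values are illustrative assumptions, not taken from the paper.
    pop_size must be divisible by group_size.
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    # One fixed finite population of (x, y) pairs, bivariate normal.
    pop = rng.multivariate_normal([0.0, 0.0], cov, size=pop_size)
    # Assumed grouping: consecutive rows form a group. Rows are i.i.d.,
    # so this amounts to random group assignment.
    groups = pop.reshape(pop_size // group_size, group_size, 2).mean(axis=1)

    r_ind = np.empty(trials)
    r_grp = np.empty(trials)
    for t in range(trials):
        # Sample individuals without replacement from the finite population.
        idx = rng.choice(pop_size, size=sample_size, replace=False)
        r_ind[t] = np.corrcoef(pop[idx, 0], pop[idx, 1])[0, 1]
        # Sample the same number of group averages without replacement.
        idx = rng.choice(len(groups), size=sample_size, replace=False)
        r_grp[t] = np.corrcoef(groups[idx, 0], groups[idx, 1])[0, 1]
    return r_ind, r_grp

if __name__ == "__main__":
    r_ind, r_grp = simulate()
    print("individuals:    mean r = %.3f, sd = %.3f" % (r_ind.mean(), r_ind.std()))
    print("group averages: mean r = %.3f, sd = %.3f" % (r_grp.mean(), r_grp.std()))
```

Under these assumptions, the two printed means and standard deviations can be compared directly. Regression slopes could be collected the same way, for example by replacing the `np.corrcoef` call with `np.polyfit(x, y, 1)[0]`.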


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ce2/11748512/135dbdfd969f/j_mcma-2024-2013_fig_0001.jpg
