Lovasi Gina S, Fink David S, Mooney Stephen J, Link Bruce G
Drexel University, 3600 Market Street, Office 751, Philadelphia, PA, 19104, United States.
Columbia University, 722 West 168 Street, Room 724, New York, NY, 10032, United States.
SSM Popul Health. 2017 Dec;3:600-608. doi: 10.1016/j.ssmph.2017.07.005. Epub 2017 Jul 19.
Non-independence among observations often warrants attention in health research. In particular, the availability of geographic information systems data has made it easier for studies to add measures of the local "neighborhood" even when participants were recruited through other contexts, such as schools or clinics. We highlight a tension between two perspectives that is often present but becomes particularly salient when more than one type of potentially health-relevant context is indexed (e.g., both neighborhood and school). On the one hand, a model-based perspective emphasizes the processes producing outcome variation, and observed data are used to make inferences about those processes. On the other hand, a design-based perspective emphasizes inference to a well-defined finite population, and is commonly invoked by those using complex survey samples or those with responsibility for the health of local residents. These two perspectives have divergent implications when deciding whether clustering must be accounted for analytically and how to select among candidate cluster definitions, though the perspectives are by no means monolithic. There are tensions within each perspective as well as between perspectives. We aim to provide insight into these perspectives and their implications for population health researchers. We focus on the crucial step of deciding which cluster definition or definitions to use at the analysis stage, as this has consequences for all subsequent analytic and interpretational challenges with potentially clustered data.
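As a purely illustrative aside, the choice among candidate cluster definitions can be made concrete with a small numerical sketch. The Python below is not the authors' method. It assumes a made-up toy dataset of eight participants, each indexed by both a school and a neighborhood, and compares a one-way ANOVA estimate of the intraclass correlation (ICC) and the corresponding Kish design effect under each definition. The function names, variable names, and data are all hypothetical.

```python
from collections import defaultdict


def anova_icc(values, cluster_ids):
    """One-way ANOVA estimate of the intraclass correlation.

    Returns (icc, n0), where n0 is the adjusted average cluster size.
    The estimate can be negative when clusters are no more alike
    than expected by chance.
    """
    groups = defaultdict(list)
    for value, cluster in zip(values, cluster_ids):
        groups[cluster].append(value)

    k = len(groups)
    n_total = len(values)
    grand_mean = sum(values) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_between += len(members) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in members)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    sum_sq_sizes = sum(len(m) ** 2 for m in groups.values())
    n0 = (n_total - sum_sq_sizes / n_total) / (k - 1)

    icc = (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)
    return icc, n0


def design_effect(icc, avg_cluster_size):
    """Kish design effect: variance inflation relative to independent data."""
    return 1 + (avg_cluster_size - 1) * icc


# Hypothetical toy data: outcomes differ by school but not by neighborhood.
outcome = [1, 2, 1, 2, 3, 4, 3, 4]
school = ["S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2"]
neighborhood = ["A", "A", "B", "B", "A", "A", "B", "B"]

icc_school, n0_school = anova_icc(outcome, school)
icc_neighborhood, n0_neighborhood = anova_icc(outcome, neighborhood)

print("school:       ICC =", round(icc_school, 3),
      "design effect =", round(design_effect(icc_school, n0_school), 3))
print("neighborhood: ICC =", round(icc_neighborhood, 3),
      "design effect =", round(design_effect(icc_neighborhood, n0_neighborhood), 3))
```

In this contrived example the outcome is strongly patterned by school and not at all by neighborhood, so the two candidate definitions give very different pictures of non-independence. A diagnostic of this kind only describes the observed data. Whether it should drive the analytic choice is part of the tension between the model-based and design-based perspectives described above.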