Vallée Julie, Souris Marc, Fournet Florence, Bochaton Audrey, Mobillion Virginie, Peyronnie Karine, Salem Gérard
Conditions et Territoires d'Emergence des Maladies (UR178), Institut de Recherche pour le Développement (IRD), PO Box 5992, Vientiane, Laos.
Emerg Themes Epidemiol. 2007 Jun 1;4:6. doi: 10.1186/1742-7622-4-6.
Geographical objectives and probabilistic methods are difficult to reconcile within a single health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. It is therefore difficult to construct a sample that is appropriate for both geographical and probabilistic methods.
We used a two-stage selection procedure in which the first stage, the selection of clusters, was non-random. Instead of randomly selecting clusters, we deliberately chose a group of clusters which, as a whole, would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision.
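For readers unfamiliar with the term, the design effect of a cluster sample is commonly approximated by Kish's formula. The block below is a general textbook relation, not a result taken from this survey, and it assumes clusters of roughly equal size. The symbols are introduced here for illustration: m is the number of respondents per cluster, ρ the intracluster correlation of the health variable, n the total sample size, and n_eff the effective sample size.

```latex
\[
\mathrm{deff} = 1 + (m - 1)\,\rho ,
\qquad
n_{\mathrm{eff}} = \frac{n}{\mathrm{deff}}
\]
% Illustrative numbers only (not from the survey):
% m = 20, \rho = 0.05  =>  deff = 1 + 19 \times 0.05 = 1.95
% n = 2000             =>  n_eff = 2000 / 1.95 \approx 1026
```

Under this approximation, a smaller intracluster correlation for a given cluster size gives a design effect closer to 1.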
We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population.
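The idea of choosing a combination of clusters whose composition mirrors the population can be sketched as a small search problem. The following is a minimal illustration, not the authors' actual procedure: the cluster names, population counts and shares are invented, the two indicators (`lao_national`, `literate`) stand in for nationality and literacy, and the distance measure (sum of absolute differences in pooled shares) is an assumption. An exhaustive search like this is only practical when the number of candidate clusters is small.

```python
from itertools import combinations

# Hypothetical census figures per candidate cluster (invented for illustration).
# "pop" is the cluster population; the other keys are shares between 0 and 1.
CLUSTERS = [
    {"name": "A", "pop": 1200, "lao_national": 0.98, "literate": 0.90},
    {"name": "B", "pop": 800,  "lao_national": 0.85, "literate": 0.95},
    {"name": "C", "pop": 1500, "lao_national": 0.99, "literate": 0.70},
    {"name": "D", "pop": 600,  "lao_national": 0.75, "literate": 0.92},
    {"name": "E", "pop": 1000, "lao_national": 0.95, "literate": 0.80},
    {"name": "F", "pop": 900,  "lao_national": 0.97, "literate": 0.60},
]
KEYS = ("lao_national", "literate")


def pooled_shares(clusters):
    """Population-weighted share of each indicator across the given clusters."""
    total = sum(c["pop"] for c in clusters)
    return tuple(sum(c["pop"] * c[k] for c in clusters) / total for k in KEYS)


def gap(subset, target):
    """Sum of absolute differences between a subset's pooled shares and the target."""
    return sum(abs(s - t) for s, t in zip(pooled_shares(subset), target))


def select_clusters(clusters, k):
    """Return the k-cluster combination whose pooled shares are closest to
    those of all candidate clusters taken together (exhaustive search)."""
    target = pooled_shares(clusters)
    return min(combinations(clusters, k), key=lambda subset: gap(subset, target))


if __name__ == "__main__":
    chosen = select_clusters(CLUSTERS, 3)
    print("Selected clusters:", [c["name"] for c in chosen])
    print("Population shares:", pooled_shares(CLUSTERS))
    print("Sample shares:    ", pooled_shares(chosen))
```

In a real survey the candidate set would also be filtered by the geographical criteria of interest before the search, so that the retained clusters serve both the spatial and the statistical objectives.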
This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy.