Vounatsou P, Smith T, Gelfand A E
Swiss Tropical Institute, Socinstrasse 57, PO Box, CH-4002 Basel, Switzerland.
Biostatistics. 2000 Jun;1(2):177-89. doi: 10.1093/biostatistics/1.2.177.
We develop hierarchical models for spatial multinomial data with missing categories, to analyse a database of HLA-A and -B gene and haplotype frequencies from Papua New Guinea, with a highly variable number of samples per spatial unit. The spatial structure of the multinomial data is incorporated by adopting conditional autoregressive (CAR) priors for the random effects, reflecting extra-multinomial variation. Different spatial structures are investigated, and covariate effects are evaluated using a novel model selection criterion. Tables and maps reveal strong spatial association and the importance of altitude, a covariate anticipated to be significant in explaining genetic variation. Our approach can be used in identifying associations with environmental factors, linguistic or epidemiological patterns and hence potential causes of genetic diversity (population movements, natural selection, stochastic effects).
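To make the role of the CAR prior concrete, the following is a minimal sketch of two generic building blocks of such a model: the conditional distribution of one spatial random effect given its neighbours under an intrinsic CAR prior, and the multinomial-logit category probabilities that the effect feeds into. It is not the authors' implementation. The function names, the four-unit toy map, the numeric values, and the use of a single random effect shared across categories are all assumptions made for illustration.

```python
import math


def car_conditional(phi, neighbours, i, sigma2):
    """Conditional mean and variance of phi[i] given its neighbours
    under an intrinsic CAR prior (hypothetical helper)."""
    nbrs = neighbours[i]
    if not nbrs:
        raise ValueError("unit has no neighbours; the conditional is undefined")
    mean = sum(phi[j] for j in nbrs) / len(nbrs)  # average of neighbouring effects
    variance = sigma2 / len(nbrs)                 # more neighbours, tighter prior
    return mean, variance


def category_probs(linear_predictors):
    """Multinomial-logit probabilities. The last category is the baseline,
    with its linear predictor fixed at 0 (hypothetical helper)."""
    eta = list(linear_predictors) + [0.0]
    shift = max(eta)                              # subtract the max for numerical stability
    weights = [math.exp(e - shift) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]


# Assumed toy map: four spatial units on a line, 0 - 1 - 2 - 3.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
phi = [0.2, -0.1, 0.4, 0.0]                       # illustrative random effects

mean, variance = car_conditional(phi, neighbours, 1, sigma2=1.0)

# Assumed fixed-effect parts 0.5 and -0.3 for two non-baseline categories at unit 1.
probs = category_probs([0.5 + phi[1], -0.3 + phi[1]])
```

In this sketch the prior pulls each unit's effect toward the average of its neighbours, which is how spatial association enters the category probabilities. A full analysis would place such a prior on each category's effects and fit the model by a sampling method, details the abstract does not specify.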