Zuccherato Luciana W, Schneider Silvana, Tarazona-Santos Eduardo, Hardwick Robert J, Berg Douglas E, Bogle Helen, Gouveia Mateus H, Machado Lee R, Machado Moara, Rodrigues-Soares Fernanda, Soares-Souza Giordano B, Togni Diego L, Zamudio Roxana, Gilman Robert H, Duarte Denise, Hollox Edward J, Rodrigues Maíra R
Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
Departamento de Estatística, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
J R Soc Interface. 2017 Mar;14(128). doi: 10.1098/rsif.2017.0057.
While multiallelic copy number variation (mCNV) loci are a major component of genomic variation, quantifying the individual copy number of a locus and defining genotypes is challenging. Few methods exist to study how mCNV genetic diversity is apportioned within and between populations (i.e. to define the population genetic structure of mCNV). These inferences are critical in populations with a small effective size, such as Amerindians, that may not fit the Hardy-Weinberg model due to inbreeding, assortative mating, population subdivision, natural selection or a combination of these evolutionary factors. We propose a likelihood-based method that simultaneously infers mCNV allele frequencies and the population structure parameter, which quantifies the departure of homozygosity from the Hardy-Weinberg expectation. This method is implemented in the freely available software CNVice, which also infers individual genotypes using information from the population and, if available, from trios. We studied the population genetics of five immune-related mCNV loci associated with complex diseases (beta-defensins, , , and ) in 12 traditional Native American populations and found that the population structure parameters inferred for these mCNVs are comparable to but lower than those for single nucleotide polymorphisms studied in the same populations.
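To make the likelihood idea concrete, the following is a minimal sketch of how diploid copy number observations could be scored under given allele frequencies and a population structure parameter. It is not the CNVice implementation. It assumes the standard inbreeding-style parameterisation in which homozygote frequencies are inflated by the structure parameter, and it assumes each individual's diploid copy number is measured without error. The symbol `f` and the function names `diploid_cn_distribution` and `log_likelihood` are illustrative choices, not taken from the source.

```python
from collections import defaultdict
from math import log


def diploid_cn_distribution(allele_freqs, f):
    """Return P(diploid copy number) given allele frequencies and f.

    allele_freqs: dict mapping an allele's copy number to its frequency.
    f: structure parameter (assumed inbreeding-style); f = 0 gives
       Hardy-Weinberg proportions, f > 0 gives excess homozygosity.
    """
    dist = defaultdict(float)
    alleles = sorted(allele_freqs)
    for i, a in enumerate(alleles):
        pa = allele_freqs[a]
        # Homozygote a/a: Hardy-Weinberg term plus the excess due to f.
        dist[2 * a] += pa * pa + f * pa * (1.0 - pa)
        for b in alleles[i + 1:]:
            pb = allele_freqs[b]
            # Heterozygote a/b: reduced by a factor (1 - f).
            dist[a + b] += 2.0 * pa * pb * (1.0 - f)
    return dict(dist)


def log_likelihood(observed_counts, allele_freqs, f):
    """Log-likelihood of observed diploid copy number counts.

    observed_counts: dict mapping diploid copy number to the number of
    individuals carrying it. Different genotypes with the same total
    copy number are summed over, since they cannot be told apart.
    """
    dist = diploid_cn_distribution(allele_freqs, f)
    total = 0.0
    for cn, n in observed_counts.items():
        if n == 0:
            continue
        p = dist.get(cn, 0.0)
        if p <= 0.0:
            return float("-inf")  # observation impossible under the model
        total += n * log(p)
    return total
```

In a full method, both the allele frequencies and `f` would be estimated jointly by maximising this quantity, for example with a numerical optimiser. The sketch only evaluates the likelihood at supplied values.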