Coelho A V C, Moura R R, Cavalcanti C A J, Guimarães R L, Sandrin-Garcia P, Crovella S, Brandão L A C
Departamento de Genética, Universidade Federal de Pernambuco, Recife, PE, Brasil.
Genet Mol Res. 2015 Mar 31;14(1):2876-84. doi: 10.4238/2015.March.31.18.
Genetic association studies determine how genes influence traits. However, undetected population substructure may bias the analysis and produce spurious results. One method to detect substructure is to genotype ancestry informative markers (AIMs) in addition to the candidate variants, quantifying how much each ancestral population contributes to the samples' genetic background. The present study aimed to use a minimum number of markers while retaining full potential to estimate ancestries. We tested the feasibility of using a subset of the 12 most informative markers from a previously established panel to estimate the contribution of three ancestral populations: European, African and Amerindian. In an ethnically diverse sample (N = 822) drawn from the 1000 Genomes database, the 12 AIMs had the same capacity to estimate ancestries as the original set of 128 AIMs, since estimates from the two panels were closely correlated. These 12 SNPs were therefore used to estimate ancestry in a new sample (N = 192) from an admixed population in Recife, Northeast Brazil. The ancestry estimates for the Recife subjects agreed with previous studies, showing that Northeastern Brazilian populations receive their largest contribution from European ancestry (59.7%), followed by African (23.0%) and Amerindian (17.3%) ancestries. Ethnicity self-classification by skin color was confirmed to be a poor indicator of population substructure in Brazilians, since ancestry estimates overlapped between classifications. Thus, our streamlined panel of 12 markers may substitute for panels with more markers while retaining the capacity to control for population substructure and admixture, thereby reducing sample processing time.
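The panel comparison described above rests on correlating per-individual ancestry estimates from the reduced panel against those from the full panel. The following is a minimal sketch of that kind of check, not the authors' actual analysis: the abstract does not state which correlation statistic or software was used, so Pearson correlation is an assumption here, the function names `pearson` and `panel_agreement` are hypothetical, and the proportions are made-up illustrative values rather than data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def panel_agreement(full, reduced, ancestries=("EUR", "AFR", "AMR")):
    """Correlate ancestry proportions from two panels, one value per ancestry.

    `full` and `reduced` are lists of dicts (ancestry label -> proportion),
    with individuals listed in the same order in both lists.
    """
    return {
        a: pearson([ind[a] for ind in full], [ind[a] for ind in reduced])
        for a in ancestries
    }


# Illustrative, made-up proportions for four hypothetical individuals.
full_panel = [
    {"EUR": 0.60, "AFR": 0.25, "AMR": 0.15},
    {"EUR": 0.10, "AFR": 0.85, "AMR": 0.05},
    {"EUR": 0.30, "AFR": 0.10, "AMR": 0.60},
    {"EUR": 0.90, "AFR": 0.05, "AMR": 0.05},
]
reduced_panel = [
    {"EUR": 0.55, "AFR": 0.30, "AMR": 0.15},
    {"EUR": 0.15, "AFR": 0.80, "AMR": 0.05},
    {"EUR": 0.35, "AFR": 0.10, "AMR": 0.55},
    {"EUR": 0.85, "AFR": 0.10, "AMR": 0.05},
]

if __name__ == "__main__":
    for ancestry, r in panel_agreement(full_panel, reduced_panel).items():
        print(f"{ancestry}: r = {r:.3f}")
```

A correlation near 1 for each ancestry component would indicate that the reduced panel ranks and scales individuals much as the full panel does; it would not by itself show that the absolute estimates are unbiased.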