Mychaleckyj Josyf C, Havt Alexandre, Nayak Uma, Pinkerton Relana, Farber Emily, Concannon Patrick, Lima Aldo A, Guerrant Richard L
Center for Public Health Genomics, University of Virginia, Charlottesville, VA.
Department of Public Health Sciences, University of Virginia, Charlottesville, VA.
Mol Biol Evol. 2017 Mar 1;34(3):559-574. doi: 10.1093/molbev/msw249.
Despite its population, geographic size, and emerging economic importance, disproportionately little genome-scale research has examined the genetic factors that predispose Brazilians to disease, or the population genetics of risk. After identifying suitable proxy populations and carefully analyzing tri-continental admixture in 1,538 North-Eastern Brazilians to estimate individual ancestry and ancestral allele frequencies, we computed 400,000 genome-wide locus-specific branch length (LSBL) Fst statistics of Brazilian Amerindian ancestry compared to European and African ancestry, and a similar set of differentiation statistics for the Amerindian component compared with the closest Asian 1000 Genomes population (surprisingly, Bengalis in Bangladesh). After ranking SNPs by these statistics, we identified the top 10 highly differentiated SNPs, which fell in five genome regions, in the LSBL tests of Brazilian Amerindian ancestry compared to European and African ancestry, and the top 10 SNPs, in eight regions, in the comparison of the Amerindian component with the closest Asian 1000 Genomes population. We found that SNPs within or proximal to the genes CIITA (rs6498115), SMC6 (rs1834619), and KLHL29 (rs2288697) were most differentiated in the Amerindian-specific branch, while SNPs in the genes ADAMTS9 (rs7631391), DOCK2 (rs77594147), SLC28A1 (rs28649017), ARHGAP5 (rs7151991), and CIITA (rs45601437) were most highly differentiated in the Asian comparison. These genes are known to influence immune function, metabolic and anthropometric traits, and embryonic development. These analyses have identified candidate genes for selection within Amerindian ancestry and, by comparison of the two analyses, those for which the differentiation may have arisen during the migration from Asia to the Americas.
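As an illustration of the ranking step described in the abstract, the sketch below computes a per-SNP LSBL from three ancestral allele frequencies and sorts SNPs by it. It uses the usual three-population definition, LSBL_A = (Fst_AB + Fst_AC - Fst_BC) / 2. The abstract does not state which Fst estimator was used, so the simple heterozygosity-based Fst here is an assumption, and the function names, SNP labels, and frequencies are hypothetical.

```python
def pairwise_fst(p1, p2):
    """Heterozygosity-based Fst for one biallelic SNP, equal population weights.

    Assumption: this simple estimator stands in for whatever estimator
    the study actually used, which the abstract does not specify.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)           # expected heterozygosity, pooled
    if h_total == 0:                            # SNP is monomorphic in both populations
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def lsbl(p_focal, p_b, p_c):
    """Locus-specific branch length for the focal population among three."""
    return (pairwise_fst(p_focal, p_b)
            + pairwise_fst(p_focal, p_c)
            - pairwise_fst(p_b, p_c)) / 2


def rank_snps(freqs, top=10):
    """Return the SNP ids with the largest focal-branch LSBL, highest first.

    freqs maps a SNP id to (p_focal, p_b, p_c) allele frequencies.
    """
    scores = {snp: lsbl(*f) for snp, f in freqs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Hypothetical frequencies ordered as (Amerindian, European, African).
example = {
    "snp_a": (0.9, 0.1, 0.1),   # differentiated on the focal branch
    "snp_b": (0.5, 0.5, 0.5),   # no differentiation
    "snp_c": (0.1, 0.1, 0.9),   # differentiated on a non-focal branch
}
top_hits = rank_snps(example, top=1)
```

In this toy input only `snp_a` carries a long focal branch. `snp_c` differs between the two reference populations but not along the focal branch, so its LSBL is zero. This is why the branch-specific statistic, rather than a single pairwise Fst, is used to single out differentiation specific to one ancestry.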