Gao Chuan, Hsu Fang-Chi, Dimitrov Latchezar M, Okut Hayrettin, Chen Yii-Der I, Taylor Kent D, Rotter Jerome I, Langefeld Carl D, Bowden Donald W, Palmer Nicholette D
Molecular Genetics and Genomics Program, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.
Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.
Genet Epidemiol. 2017 May;41(4):353-362. doi: 10.1002/gepi.22042. Epub 2017 Apr 5.
Insertions and deletions (INDELs) represent a significant fraction of interindividual variation in the human genome, yet their contribution to phenotypes is poorly understood. To confirm the quality of imputed INDELs and investigate their roles in mediating cardiometabolic phenotypes, genome-wide association and linkage analyses were performed for 15 phenotypes with 1,273,952 imputed INDELs in 1,024 Mexican-origin Americans. Imputation quality was validated using whole exome sequencing with an average kappa of 0.93 in common INDELs (minor allele frequencies [MAFs] ≥ 5%). Association analysis revealed one genome-wide significant association signal for the cholesteryl ester transfer protein gene (CETP) with high-density lipoprotein levels (rs36229491, P = 3.06 × 10 ); linkage analysis identified two peaks with logarithm of the odds (LOD) > 5 (rs60560566, LOD = 5.36 with insulin sensitivity [S_I], and rs5825825, LOD = 5.11 with adiponectin levels). Suggestive overlapping signals between linkage and association were observed: rs59849892 in the WSC domain containing 2 gene (WSCD2) was associated and nominally linked with S_I (P = 1.17 × 10 , LOD = 1.99). This gene has been implicated in glucose metabolism in human islet cell expression studies. In addition, rs201606363 was linked and nominally associated with low-density lipoprotein (P = 4.73 × 10 , LOD = 3.67), apolipoprotein B (P = 1.39 × 10 , LOD = 4.64), and total cholesterol (P = 1.35 × 10 , LOD = 3.80) levels. rs201606363 is an intronic variant of the UBE2F-SCLY fusion gene (where UBE2F is ubiquitin-conjugating enzyme E2F and SCLY is selenocysteine lyase), which may regulate cholesterol through selenium metabolism. In conclusion, these results confirm the feasibility of imputing INDELs from array-based single nucleotide polymorphism (SNP) genotypes. Analysis of these variants using association and linkage replicated previously identified SNP signals and identified multiple novel INDEL signals. These results support the inclusion of INDELs in genetic studies to more fully interrogate the spectrum of genetic variation.
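The kappa statistic used above to validate imputation measures agreement between two sets of genotype calls after correcting for agreement expected by chance. The following is a minimal sketch of Cohen's kappa for a single variant, assuming genotypes are coded as alternate-allele counts (0, 1, 2). The function name `cohen_kappa` and the toy data are hypothetical and are not taken from the study; the abstract does not state how the per-variant values were computed or averaged.

```python
from collections import Counter


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length sequences of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of samples where both methods give the same call.
    p_obs = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    freq_a = Counter(calls_a)
    freq_b = Counter(calls_b)
    p_exp = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if p_exp == 1.0:
        # Both methods report one identical class for every sample;
        # kappa is undefined here, so this sketch treats it as full agreement.
        return 1.0
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical toy data: alternate-allele counts for one INDEL in ten samples.
imputed = [0, 0, 1, 1, 2, 0, 1, 0, 2, 1]
sequenced = [0, 0, 1, 1, 2, 0, 1, 1, 2, 1]
kappa = cohen_kappa(imputed, sequenced)
print(f"kappa = {kappa:.3f}")
```

In this toy example nine of ten calls agree, but kappa is lower than 0.9 because part of that agreement would be expected by chance given the genotype frequencies.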