Wallace Amelia D, Wendt George A, Barcellos Lisa F, de Smith Adam J, Walsh Kyle M, Metayer Catherine, Costello Joseph F, Wiemels Joseph L, Francis Stephen S
Division of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, CA, United States.
Division of Epidemiology, School of Community Health Sciences, University of Nevada, Reno, NV, United States.
Front Genet. 2018 Aug 14;9:298. doi: 10.3389/fgene.2018.00298. eCollection 2018.
Approximately 8% of the human genome is composed of endogenous retroviral insertions (ERVs) originating from historic retroviral integration into germ cells. The function of ERVs as regulators of gene expression is well established. Less well studied are insertional polymorphisms of ERVs and their contribution to the heritability of complex phenotypes. The most recently integrated ERV, HERV-K, is expressed in a range of complex human conditions from cancer to neurologic diseases. Using an in-house computational pipeline and whole-genome sequencing data from the diverse 1,000 Genomes Phase 3 population (n = 2,504), we identified 46 polymorphic HERV-K insertions that are tagged by adjacent single nucleotide polymorphisms (SNPs). To test the potential role of polymorphic HERV-K in the heritability of complex diseases, existing databases were queried for enrichment of established relationships between the HERV-K insertion-associated SNPs (hiSNPs) and tissue-specific gene expression and disease phenotypes. Overall, hiSNPs for the 46 polymorphic HERV-K sites were statistically enriched (P < 1.0E) for eQTLs across 44 human tissues. Fifteen of the 46 HERV-K insertions had hiSNPs that were annotated in the EMBL-EBI GWAS Catalog and cumulatively associated with >100 phenotypes. Experimental factor ontology enrichment analysis suggests that polymorphic HERV-K insertions specifically contribute to neurologic and immunologic disease phenotypes, including traits related to intracranial volume (FDR 2.00E-09), Parkinson's disease (FDR 1.80E-09), and autoimmune diseases (FDR 1.80E-09). These results provide strong candidates for context-specific study of polymorphic HERV-K insertions in disease-related traits, serving as a roadmap for future studies of the heritability of complex disease.