Gupta Hemant, Chandratre Khyati, Sinha Siddharth, Huang Teng, Wu Xiaobing, Cui Jian, Zhang Michael Q, Wang San Ming
Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China.
Eppley Institute for Cancer Research, University of Nebraska Medical Center, Omaha, NE, 68198, USA.
BMC Genomics. 2020 Nov 30;21(1):842. doi: 10.1186/s12864-020-07222-5.
The core promoter controls transcription initiation. However, little is known about core promoter diversity in the human genome and its relationship with disease. We hypothesized that, as a functionally important component of the genome, the human core promoter could be under evolutionary selection, as reflected by its high diversification, which adjusts gene expression for better adaptation to different environments.
Applying the "Exome-based Variant Detection in Core-promoters" method, we analyzed human core promoter diversity using 2682 exome data sets from 25 worldwide human populations sequenced by the 1000 Genomes Project. Collectively, we identified 31,996 variants in the core promoter region (-100 to +100) of 12,509 human genes (https://dbhcpd.fhs.um.edu.mo). Analysis of these variation data identified highly ethnic-specific patterns of core promoter variation among populations, the genes with highly variable core promoters, the motifs affected by the variants, and the functional pathways involved. eQTL testing revealed that 12% of core promoter variants can significantly alter gene expression levels. By comparison with GWAS data, we located 163 variants that are GWAS-identified, trait-associated variants linked to multiple diseases; half of these variants can alter gene expression.
Data from our study reveal the highly diversified nature of the core promoter in the human genome and highlight that core promoter variation could play important roles not only in gene expression regulation but also in disease predisposition.
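As a rough illustration of the -100 to +100 core promoter window used above, the following Python sketch selects variants that fall within that window around a transcription start site (TSS). It is not the authors' "Exome-based Variant Detection in Core-promoters" pipeline. The `Gene` record, the function names, the gene names, and all coordinates are hypothetical, and the offset convention (signed distance from the TSS, with the TSS itself at offset 0 and negative values upstream) is an assumption made for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record; not a structure from the study."""
    name: str
    chrom: str
    tss: int      # genomic coordinate of the transcription start site
    strand: str   # "+" or "-"


def tss_offset(gene: Gene, pos: int) -> int:
    """Signed distance from the TSS; positive means downstream in the
    direction of transcription (assumed convention, TSS at offset 0)."""
    delta = pos - gene.tss
    return delta if gene.strand == "+" else -delta


def in_core_promoter(gene: Gene, chrom: str, pos: int,
                     upstream: int = 100, downstream: int = 100) -> bool:
    """True if the position lies within [-upstream, +downstream] of the TSS."""
    if chrom != gene.chrom:
        return False
    return -upstream <= tss_offset(gene, pos) <= downstream


def variants_in_core_promoters(genes, variants):
    """Return (gene name, chrom, pos, offset) for each variant in a window.
    A simple nested loop; a real analysis would use an interval index."""
    hits = []
    for chrom, pos in variants:
        for gene in genes:
            if in_core_promoter(gene, chrom, pos):
                hits.append((gene.name, chrom, pos, tss_offset(gene, pos)))
    return hits


# Made-up genes and variant positions, for illustration only.
genes = [
    Gene("GENE_A", "chr1", 1000, "+"),
    Gene("GENE_B", "chr1", 5000, "-"),
]
variants = [("chr1", 950), ("chr1", 1101), ("chr1", 5080), ("chr2", 1000)]
hits = variants_in_core_promoters(genes, variants)
```

With these made-up inputs, the variant at position 950 lies 50 bases upstream of GENE_A, and the variant at 5080 lies 80 bases upstream of the minus-strand GENE_B, so both are retained. The variant at 1101 falls one base outside the window and the `chr2` variant matches no gene.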