Garvan Institute of Medical Research, Sydney, NSW, Australia.
Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia.
Genome Biol. 2017 Dec 28;18(1):241. doi: 10.1186/s13059-017-1363-3.
Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored.
To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression.
This resource of previously unreported transcripts in disease-associated regions (http://gwas-captureseq.dingerlab.org) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.