Wellcome Trust Centre for Human Genetics and Department of Clinical Pharmacology, University of Oxford, Oxford, UK.
Hum Mol Genet. 2011 Jul 15;20(14):2879-88. doi: 10.1093/hmg/ddr190. Epub 2011 Apr 29.
We have previously identified several colorectal cancer (CRC)-associated polymorphisms using genome-wide association (GWA) analysis. We sought to fine-map the location of the functional variants for three of these regions at 8q23.3 (EIF3H), 16q22.1 (CDH1/CDH3) and 19q13.11 (RHPN2). We genotyped two case-control sets at high density in the selected regions and used existing data from four other case-control sets, comprising a total of 9328 CRC cases and 10,480 controls. To improve marker density, we imputed genotypes from the 1000 Genomes Project and HapMap3 data sets. All three regions contained smaller areas in which a cluster of single nucleotide polymorphisms (SNPs) showed clearly stronger association signals than surrounding SNPs, allowing us to assign those areas as the most likely location of the disease-associated functional variant. Further fine-mapping within those areas was generally unhelpful in identifying the functional variation based on strengths of association. However, functional annotation suggested a relatively small number of functional SNPs, including some with potential regulatory function at 8q23.3 and 16q22.1 and a non-synonymous SNP in RHPN2. Interestingly, the expression quantitative trait locus browser showed a number of highly associated SNP alleles correlated with mRNA expression levels not of EIF3H and CDH1 or CDH3, but of UTP23 and ZFP90, respectively. In contrast, none of the top SNPs within these regions was associated with transcript levels at EIF3H, CDH1 or CDH3. Our post-GWA study highlights the benefits of fine-mapping common disease variants in combination with publicly available data sets. In addition, caution should be exercised when assigning functionality to candidate genes in regions discovered through GWA analysis.
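We previously identified several polymorphisms associated with colorectal cancer (CRC) using genome-wide association (GWA) analysis. We sought to fine-map the functional variants in three of these regions: 8q23.3 (EIF3H), 16q22.1 (CDH1/CDH3) and 19q13.11 (RHPN2). We genotyped two case-control sets at high density in the selected regions and used existing data from four other case-control sets, for a total of 9328 CRC cases and 10,480 controls. To increase marker density, we imputed genotypes from the 1000 Genomes Project and HapMap3 data sets. All three regions contained smaller areas in which a cluster of single nucleotide polymorphisms (SNPs) showed clearly stronger association signals than the surrounding SNPs, allowing us to designate those areas as the most likely location of the disease-associated functional variant. Further fine-mapping within those areas generally did not help to identify the functional variation on the basis of association strength. However, functional annotation pointed to a relatively small number of functional SNPs, including some with potential regulatory function at 8q23.3 and 16q22.1 and a non-synonymous SNP in RHPN2. Interestingly, the expression quantitative trait locus browser showed that a number of highly associated SNP alleles correlated with mRNA expression levels not of EIF3H and CDH1 or CDH3, but of UTP23 and ZFP90, respectively. In contrast, none of the top SNPs within these regions was associated with transcript levels of EIF3H, CDH1 or CDH3. Our post-GWA study highlights the benefits of fine-mapping common disease variants in combination with publicly available data sets. In addition, caution should be exercised when assigning function to candidate genes in regions discovered through GWA analysis.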
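As an illustration of comparing "strengths of association" across SNPs in a fine-mapped region, the following is a minimal sketch of a per-SNP allelic test on a 2x2 table of allele counts. The abstract does not state which test statistic the study used, so the plain Pearson chi-square here is an assumption, not the authors' method. The function names (`allelic_test`, `rank_snps`), the SNP labels and all counts are hypothetical and invented for the example.

```python
import math


def allelic_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, chi_square, p_value) for a 2x2 allele-count table.

    Assumes all four counts are positive. Uses the Pearson chi-square
    statistic with 1 degree of freedom and no continuity correction.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    chi2 = numerator / denominator
    # Upper tail of a chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


def rank_snps(counts):
    """Sort SNPs from strongest to weakest association (smallest p-value first)."""
    results = {snp: allelic_test(*table) for snp, table in counts.items()}
    return sorted(results.items(), key=lambda item: item[1][2])


# Hypothetical toy data: (case_alt, case_ref, ctrl_alt, ctrl_ref) per SNP.
toy_counts = {
    "snp_A": (420, 580, 380, 620),
    "snp_B": (450, 550, 380, 620),
    "snp_C": (400, 600, 395, 605),
}

for snp, (odds_ratio, chi2, p_value) in rank_snps(toy_counts):
    print(f"{snp}: OR={odds_ratio:.2f} chi2={chi2:.2f} p={p_value:.3g}")
```

Ranking by p-value in this way shows which SNPs stand out from their neighbours, but, as the abstract notes, SNPs in strong linkage disequilibrium tend to give similar statistics, so association strength alone may not single out the functional variant.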