Cancer Institute (Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education), Second Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, Zhejiang Province, China.
PLoS One. 2013 Aug 5;8(8):e70307. doi: 10.1371/journal.pone.0070307. Print 2013.
Recent studies have demonstrated the power of deep re-sequencing of the whole genome or exome in understanding cancer genomes. However, targeted capture of selected whole gene-body regions, rather than the whole exome, has several advantages: 1) the genes can be selected based on biology or a hypothesis; 2) mutations in promoter and intronic regions, which have important regulatory roles, can be investigated; and 3) it is less expensive than whole-genome or whole-exome sequencing. We therefore designed custom high-density oligonucleotide microarrays (NimbleGen Inc.) to capture approximately 1.7 Mb of target regions comprising the genomic regions of 28 genes related to colorectal cancer (CRC), including genes belonging to the WNT signaling pathway as well as important transcription factors and colon-specific genes that are overexpressed in CRC. The 1.7 Mb of targeted regions were sequenced at a coverage ranging from 32× to 45× across the 28 genes. We identified a total of 2342 sequence variations in the CRC and corresponding adjacent normal tissues. Of these, 738 were novel sequence variations based on comparison with the SNP database (dbSNP135). We validated 56 of 66 SNPs in a separate cohort of 30 CRC tissues using the Sequenom MassARRAY iPLEX platform, giving a validation rate of approximately 85% (56/66). Among the exonic variations, we found 15 missense mutations and 21 synonymous SNPs that were predicted to change exonic splicing motifs. We also found 31 UTR SNPs that were predicted to occur at transcription factor binding sites, 20 intronic SNPs located near splice sites, 43 SNPs in conserved transcription factor binding sites, and 32 SNPs in CpG islands. Finally, we determined that rs3106189, located in the 5' UTR of antigen presenting tapasin binding protein (TAPBP), and rs1052918, located in the 3' UTR of transcription factor 3 (TCF3), were associated with overall survival of CRC patients.
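The proportions implied by the counts above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the numbers reported in the abstract; the variable names and the `percent` helper are illustrative assumptions, not part of the study's analysis pipeline.

```python
# Counts taken directly from the abstract.
total_variants = 2342      # sequence variations found in CRC and adjacent normal tissue
novel_variants = 738       # variations absent from dbSNP135
validated_snps = 56        # SNPs confirmed by Sequenom MassARRAY iPLEX
attempted_snps = 66        # SNPs submitted for validation


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Variations already catalogued in dbSNP135.
known_variants = total_variants - novel_variants

novel_pct = percent(novel_variants, total_variants)
validation_pct = percent(validated_snps, attempted_snps)

print(f"known variants: {known_variants}")
print(f"novel fraction: {novel_pct}%")
print(f"validation rate: {validation_pct}%")
```

Under these counts, roughly a third of the detected variations are novel relative to dbSNP135, and the validation rate of 56/66 comes to just under 85%.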