Wang Min, Beck Christine R, English Adam C, Meng Qingchang, Buhay Christian, Han Yi, Doddapaneni Harsha V, Yu Fuli, Boerwinkle Eric, Lupski James R, Muzny Donna M, Gibbs Richard A
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA.
BMC Genomics. 2015 Mar 19;16(1):214. doi: 10.1186/s12864-015-1370-2.
Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high.
We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. We illustrate the efficacy of PacBio-LITS by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki-Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and were also able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for the formation of these structural variants.
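The idea of reading a breakpoint junction off a long read can be sketched in a few lines. This is a minimal illustration only, not the authors' pipeline: it assumes each long read has already been aligned and split into segments by some aligner, and the `Segment` and `Junction` records, the `candidate_junctions` function, the gap thresholds, and all coordinates are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a long read (hypothetical, simplified record)."""
    read_start: int  # 0-based start on the read
    read_end: int    # end on the read (exclusive)
    chrom: str
    ref_start: int   # 0-based start on the reference
    ref_end: int     # end on the reference (exclusive)
    strand: str      # "+" or "-"


@dataclass(frozen=True)
class Junction:
    """A candidate breakpoint junction joining two reference positions."""
    chrom_a: str
    pos_a: int
    strand_a: str
    chrom_b: str
    pos_b: int
    strand_b: str


def candidate_junctions(
    segments: List[Segment],
    max_read_gap: int = 50,  # assumed tolerance, not from the paper
    max_ref_gap: int = 50,   # assumed tolerance, not from the paper
) -> List[Junction]:
    """Report junctions between segments that are adjacent on the read
    but not collinear on the reference."""
    ordered = sorted(segments, key=lambda s: s.read_start)
    junctions = []
    for left, right in zip(ordered, ordered[1:]):
        # Only consider segments that abut (or nearly abut) on the read.
        if abs(right.read_start - left.read_end) > max_read_gap:
            continue
        # Reference coordinate where the read leaves the left segment
        # and where it enters the right segment, following read direction.
        pos_a = left.ref_end if left.strand == "+" else left.ref_start
        pos_b = right.ref_start if right.strand == "+" else right.ref_end
        same_path = left.chrom == right.chrom and left.strand == right.strand
        if same_path:
            step = pos_b - pos_a if left.strand == "+" else pos_a - pos_b
            if 0 <= step <= max_ref_gap:
                continue  # collinear continuation, no rearrangement implied
        junctions.append(
            Junction(left.chrom, pos_a, left.strand,
                     right.chrom, pos_b, right.strand)
        )
    return junctions


# Hypothetical read whose second half maps upstream of its first half,
# the pattern expected across a tandem duplication junction.
dup_read = [
    Segment(0, 1000, "chr17", 1000, 2000, "+"),
    Segment(1000, 2000, "chr17", 500, 1500, "+"),
]
dup_junctions = candidate_junctions(dup_read)

# Hypothetical read that continues normally along the reference.
normal_read = [
    Segment(0, 1000, "chr17", 1000, 2000, "+"),
    Segment(1000, 2000, "chr17", 2000, 3000, "+"),
]
normal_junctions = candidate_junctions(normal_read)
```

A strand switch between adjacent segments is reported in the same way, which is the signature one would look for at an inversion breakpoint. A real analysis would also need to handle overlapping segments, mapping quality, and ambiguity within repeats such as LCRs, none of which this sketch attempts.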
The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and validation of small insertions and deletions, could also benefit from this technology.