Dawes Joanna C, Webster Philip, Iadarola Barbara, Garcia-Diaz Claudia, Dore Marian, Bolt Bruce J, Dewchand Hamlata, Dharmalingam Gopuraja, McLatchie Alex P, Kaczor Jakub, Caceres Juan J, Paccanaro Alberto, Game Laurence, Parrinello Simona, Uren Anthony G
1 MRC London Institute of Medical Sciences (LMS), Du Cane Road, London, W12 0NN, UK.
2 Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College London, Du Cane Road, London, UK.
Mob DNA. 2020 Feb 4;11:7. doi: 10.1186/s13100-020-0201-4. eCollection 2020.
Ligation-mediated PCR protocols have diverse uses, including identifying the integration sites of insertional mutagens, integrating vectors and naturally occurring mobile genetic elements. For approaches that employ next-generation sequencing (NGS), the relative abundance of integrations within a complex mixture is typically estimated from read counts or from the number of unique fragment lengths produced by ligation to sheared DNA; however, these estimates may be skewed by PCR amplification biases and by saturation of sequencing coverage.
Here we describe a modification of our previous splinkerette-based ligation-mediated PCR protocol that uses a novel Illumina-compatible adapter design, which prevents amplification of non-target DNA and incorporates unique molecular identifiers (UMIs). This design reduces the number of PCR cycles required and improves relative quantitation of integration abundance at saturating sequencing coverage. Because the forked adapter strands are inverted relative to the standard orientation, the integration-genome junction can be sequenced without affecting the sequence diversity required for cluster generation on the flow cell. Replicate libraries of murine leukemia virus (MuLV)-infected spleen samples yielded highly reproducible quantitation of clonal integrations as well as deep coverage of subclonal integrations. A dilution series of DNAs bearing integrations of MuLV or the piggyBac transposon shows that quantitation is linear over a range of concentrations.
Merging the ligation and library generation steps can reduce the total number of PCR amplification cycles without sacrificing coverage or fidelity. The protocol is robust enough for use in a 96-well format with an automated liquid handler, and we include programs for a Beckman Biomek liquid handling workstation. We also include an informatics pipeline that maps reads, builds integration contigs and quantitates integration abundance using both fragment lengths and unique molecular identifiers. Suggestions for adapting the protocol to other target DNA sequences are included. The reproducible distinction between clonal and subclonal integration sites allows analysis of populations of cells undergoing selection, such as those found in insertional mutagenesis screens.
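To illustrate the difference between the three abundance measures mentioned above (raw read counts, unique fragment lengths and unique molecular identifiers), the following is a minimal sketch and not the published pipeline. It assumes reads have already been mapped and assigned to an integration site; the function name `quantify_integrations`, the tuple layout `(site, fragment_length, umi)` and the example data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def quantify_integrations(reads):
    """Summarise abundance per integration site.

    `reads` is an iterable of (site, fragment_length, umi) tuples, one per
    mapped read. This layout is an assumption made for the example.
    """
    read_counts = defaultdict(int)
    fragment_lengths = defaultdict(set)
    umis = defaultdict(set)

    for site, fragment_length, umi in reads:
        read_counts[site] += 1                       # inflated by PCR duplicates
        fragment_lengths[site].add(fragment_length)  # one per distinct shear point
        umis[site].add(umi)                          # one per distinct adapter tag

    total_umis = sum(len(u) for u in umis.values())
    return {
        site: {
            "reads": read_counts[site],
            "unique_fragment_lengths": len(fragment_lengths[site]),
            "unique_umis": len(umis[site]),
            # Relative abundance based on UMI counts across all sites.
            "umi_fraction": len(umis[site]) / total_umis,
        }
        for site in read_counts
    }


# Hypothetical example: site A has PCR duplicates and two fragments that share
# a length but carry different UMIs; site B is a single heavily amplified fragment.
example_reads = [
    ("chr1:1000:+", 150, "ACGT"),
    ("chr1:1000:+", 150, "ACGT"),
    ("chr1:1000:+", 150, "TTGA"),
    ("chr1:1000:+", 210, "GGCA"),
    ("chr2:5000:-", 180, "CATG"),
    ("chr2:5000:-", 180, "CATG"),
    ("chr2:5000:-", 180, "CATG"),
]

result = quantify_integrations(example_reads)
for site, stats in result.items():
    print(site, stats)
```

In this toy input the read counts (4 versus 3) suggest the two sites are similarly abundant, whereas the UMI counts (3 versus 1) separate them. The unique fragment length count (2 versus 1) undercounts the first site because two independent fragments happen to share a length, which is the kind of saturation effect that UMIs are meant to mitigate. A real pipeline would also need to tolerate sequencing errors in UMIs and mapping positions, which this sketch ignores.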