Spies Noah, Weng Ziming, Bishara Alex, McDaniel Jennifer, Catoe David, Zook Justin M, Salit Marc, West Robert B, Batzoglou Serafim, Sidow Arend
Genome-scale Measurements Group, National Institute of Standards and Technology, Gaithersburg, Maryland, USA.
Joint Initiative for Metrology in Biology, Stanford, California, USA.
Nat Methods. 2017 Sep;14(9):915-920. doi: 10.1038/nmeth.4366. Epub 2017 Jul 17.
In read cloud approaches, long genomic DNA fragments are partitioned microfluidically and the shorter fragments derived from them are barcoded, so that short sequencing reads retain long-range information. This combination of short reads with long-range information represents a powerful alternative to single-molecule long-read sequencing. We develop Genome-wide Reconstruction of Complex Structural Variants (GROC-SVs) for structural variant (SV) detection and assembly from read cloud data and apply this method to Illumina-sequenced 10x Genomics sarcoma and breast cancer data sets. Compared with short-fragment sequencing, GROC-SVs substantially improves the specificity of breakpoint detection at comparable sensitivity. This approach also performs sequence assembly across multiple breakpoints simultaneously, enabling the reconstruction of highly complex events. We show that chromothriptic rearrangements occurred before copy number amplifications, and that rates of single-nucleotide variants and SVs are not correlated. Our results support the use of read cloud approaches to advance the characterization of large and complex structural variation.
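The long-range signal described in the abstract can be illustrated with a minimal sketch: reads that come from the same long DNA fragment carry the same barcode, so two distant loci that share many barcodes are likely adjacent on the sequenced molecules. The code below is not the GROC-SVs algorithm. The read record layout, the function names, and the overlap score are all hypothetical, chosen only to show the idea.

```python
from typing import Sequence, Set, Tuple

# Hypothetical read record: (barcode, chromosome, position).
# Real read cloud data would come from barcode-tagged alignments.
Read = Tuple[str, str, int]
# Hypothetical window: (chromosome, start, end), half-open interval.
Window = Tuple[str, int, int]


def barcodes_in_window(reads: Sequence[Read], window: Window) -> Set[str]:
    """Collect the barcodes of all reads that fall inside the window."""
    chrom, start, end = window
    return {bc for bc, c, pos in reads if c == chrom and start <= pos < end}


def shared_barcode_fraction(
    reads: Sequence[Read], window_a: Window, window_b: Window
) -> float:
    """Fraction of barcodes shared by two windows, relative to the smaller set.

    This is an illustrative score (an assumption), not a published statistic.
    """
    a = barcodes_in_window(reads, window_a)
    b = barcodes_in_window(reads, window_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


if __name__ == "__main__":
    # Toy data: barcodes BC1 and BC2 appear on both chr1 and chr5.
    reads = [
        ("BC1", "chr1", 1000), ("BC1", "chr5", 50000),
        ("BC2", "chr1", 1200), ("BC2", "chr5", 50500),
        ("BC3", "chr1", 1100), ("BC4", "chr5", 50200),
    ]
    score = shared_barcode_fraction(
        reads, ("chr1", 0, 10000), ("chr5", 40000, 60000)
    )
    print(f"shared barcode fraction: {score:.2f}")
```

Unrelated loci would be expected to share few barcodes by chance, so a high fraction between distant windows is the kind of evidence a read cloud method can use to nominate a candidate breakpoint. A practical method would also need a background model for chance sharing, which this sketch omits.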