Layer Ryan M, Chiang Colby, Quinlan Aaron R, Hall Ira M
Genome Biol. 2014 Jun 26;15(6):R84. doi: 10.1186/gb-2014-15-6-r84.
Comprehensive discovery of structural variation (SV) from whole genome sequencing data requires multiple detection signals, including read-pair, split-read, read-depth and prior knowledge. Owing to technical challenges, extant SV discovery algorithms either use one signal in isolation or, at best, use two sequentially. We present LUMPY, a novel SV discovery framework that naturally integrates multiple SV signals jointly across multiple samples. We show that LUMPY yields improved sensitivity, especially when SV signal is reduced owing to either low-coverage data or low intra-sample variant allele frequency. We also report a set of 4,564 validated breakpoints from the NA12878 human genome. LUMPY is available at https://github.com/arq5x/lumpy-sv.
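To make the idea of joint signal integration concrete, the following is a minimal toy sketch, not LUMPY's actual implementation. It assumes each signal reports a per-base probability vector for a breakpoint over some interval, and that overlapping vectors are combined by summing and renormalizing. The `Evidence` type, the `integrate` and `most_likely_position` functions, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Evidence:
    """One signal's belief about a breakpoint location (hypothetical type)."""
    source: str         # e.g. "read-pair" or "split-read"
    start: int          # genomic coordinate of probs[0]
    probs: List[float]  # relative probability per base, beginning at `start`


def integrate(evidence: List[Evidence]) -> Tuple[int, List[float]]:
    """Sum per-base probabilities over the union interval, then normalize.

    Summation is an assumption of this sketch, not a claim about LUMPY.
    """
    lo = min(e.start for e in evidence)
    hi = max(e.start + len(e.probs) for e in evidence)
    combined = [0.0] * (hi - lo)
    for e in evidence:
        for offset, p in enumerate(e.probs):
            combined[e.start - lo + offset] += p
    total = sum(combined)
    return lo, [p / total for p in combined]


def most_likely_position(evidence: List[Evidence]) -> int:
    """Return the coordinate with the highest combined probability."""
    lo, combined = integrate(evidence)
    return lo + max(range(len(combined)), key=combined.__getitem__)


# Illustrative values only: a broad read-pair signal and a sharp split-read signal.
paired = Evidence("read-pair", 100, [0.1, 0.2, 0.4, 0.2, 0.1])
split = Evidence("split-read", 102, [0.9, 0.1])
print(most_likely_position([paired, split]))
```

In this toy setup the broad read-pair interval and the narrow split-read interval reinforce each other where they overlap, which is the intuition behind using several signals jointly rather than one at a time.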