Human Genome Sequencing Center, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA.
Department of Molecular and Human Genetics, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA.
BMC Genomics. 2017 Oct 3;18(Suppl 6):691. doi: 10.1186/s12864-017-4021-y.
Characterization of genomic structural variation (SV) is essential to expanding the research and clinical applications of genome sequencing. Reliance on paired-end sequencing of short DNA fragments has yielded a wealth of single nucleotide variants and insertions-deletions internal to sequencing reads, at the cost of limited SV detection. Mate pair sequencing of multi-kilobase DNA fragments has helped fill this void in SV detection, but it introduced new analytic challenges that require SV detection tools designed specifically for mate pair sequencing data. Here, we introduce SVachra (Structural Variation Assessment of CHRomosomal Aberrations), a breakpoint calling program that identifies large insertions-deletions, inversions, and inter- and intra-chromosomal translocations using both the inward and outward facing read types generated by mate pair sequencing.
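To make the role of inward and outward facing read types concrete, the following is a minimal Python sketch of how a mate pair might be sorted into discordant categories. It is an illustration under stated assumptions, not SVachra's implementation: the `ReadPair` record, the `classify` function, the category labels, and the span ranges in `EXPECTED_SPAN` are all hypothetical, and a real caller would estimate the ranges from the data and cluster many pairs before reporting a breakpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical minimal record for one aligned mate pair."""
    chrom1: str
    pos1: int
    reverse1: bool  # True if read 1 aligns to the reverse strand
    chrom2: str
    pos2: int
    reverse2: bool  # True if read 2 aligns to the reverse strand


# Assumed expected span (min, max) in bp for each orientation class.
# These numbers are placeholders, not values from the paper.
EXPECTED_SPAN = {"inward": (200, 600), "outward": (2000, 6000)}


def orientation(pair: ReadPair) -> str:
    """Return 'inward', 'outward', or 'same_strand' for a same-chromosome pair."""
    # Order the two reads by coordinate so "left" and "right" are well defined.
    (_, left_rev), (_, right_rev) = sorted(
        [(pair.pos1, pair.reverse1), (pair.pos2, pair.reverse2)]
    )
    if left_rev == right_rev:
        return "same_strand"
    # Left read forward and right read reverse point toward each other.
    return "inward" if not left_rev else "outward"


def classify(pair: ReadPair, expected=EXPECTED_SPAN) -> str:
    """Assign a coarse category to one pair; labels are illustrative only."""
    if pair.chrom1 != pair.chrom2:
        return "inter_chromosomal"
    kind = orientation(pair)
    if kind == "same_strand":
        return "inversion_candidate"
    low, high = expected[kind]
    span = abs(pair.pos2 - pair.pos1)
    if span > high:
        return "deletion_candidate"
    if span < low:
        return "insertion_candidate"
    return "concordant"


if __name__ == "__main__":
    pair = ReadPair("chr1", 1000, True, "chr1", 20000, False)
    print(orientation(pair), classify(pair))
```

Keeping a separate expected span for each orientation is the design choice of interest here: each pair is judged against the range for its own read type, so a pair is not flagged as discordant simply because it faces inward rather than outward.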
We demonstrate SVachra's utility by executing the program on large-insert (Illumina Nextera) mate pair sequencing data from the personal genome of a single subject (HS1011). An additional long-read data set (Pacific Biosciences RSII) was also generated to validate SV calls from SVachra and from the other SV calling programs used for comparison. SVachra exhibited the highest validation rate and reported the widest distribution of SV types and size ranges when compared to other SV callers.
SVachra is a highly specific breakpoint calling program that exhibits a less biased SV detection methodology than other callers.