The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA.
Genome Biol. 2018 Mar 20;19(1):38. doi: 10.1186/s13059-018-1404-6.
Comprehensive and accurate identification of structural variations (SVs) from next-generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built from an analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE.
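To make the idea of merging callsets from several SV callers concrete, here is a minimal Python sketch. It is not FusorSV's fusion model, which the abstract describes as data-driven; it swaps in a much simpler consensus rule that groups calls of the same type by reciprocal overlap and keeps groups supported by a minimum number of callers. The `SV` record, the function names, the thresholds `min_ro` and `min_callers`, and the caller labels are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical minimal SV record; real callsets (e.g. VCF) carry many more fields.
SV = namedtuple("SV", "chrom start end svtype caller")


def reciprocal_overlap(a, b):
    """Overlap length divided by the longer interval (0.0 if incompatible)."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    inter = min(a.end, b.end) - max(a.start, b.start)
    if inter <= 0:
        return 0.0
    return inter / max(a.end - a.start, b.end - b.start)


def merge_calls(calls, min_ro=0.5, min_callers=2):
    """Greedy consensus merge; thresholds are illustrative assumptions."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start)):
        # Attach the call to the first cluster whose seed call it overlaps enough.
        for cluster in clusters:
            if reciprocal_overlap(cluster[0], call) >= min_ro:
                cluster.append(call)
                break
        else:
            clusters.append([call])

    merged = []
    for cluster in clusters:
        callers = sorted({c.caller for c in cluster})
        # Keep only clusters supported by enough distinct callers.
        if len(callers) >= min_callers:
            merged.append((
                cluster[0].chrom,
                min(c.start for c in cluster),
                max(c.end for c in cluster),
                cluster[0].svtype,
                callers,
            ))
    return merged


if __name__ == "__main__":
    # Made-up calls from two hypothetical callers, "callerA" and "callerB".
    calls = [
        SV("chr1", 1000, 2000, "DEL", "callerA"),
        SV("chr1", 1050, 2100, "DEL", "callerB"),
        SV("chr1", 5000, 6000, "DUP", "callerA"),
    ]
    for row in merge_calls(calls):
        print(row)
```

A fixed voting rule like this treats every caller as equally reliable for every SV type and size. A performance-aware approach of the kind the abstract describes would instead weight callers by how well they perform against a reference callset.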