Institute of Medical Informatics, University of Münster, Albert-Schweitzer-Campus 1, Building A11, Münster 48149, Germany.
Laboratory Hematology, RadboudUMC, Geert Grooteplein Zuid 10, Nijmegen 6525 GA, Netherlands.
Gigascience. 2020 Nov 2;9(11). doi: 10.1093/gigascience/giaa118.
Copy number variants (CNVs) are known to play an important role in the development and progression of several diseases. However, detection of CNVs with whole-exome sequencing (WES) experiments is challenging. Usually, additional experiments have to be performed.
We developed a novel algorithm for somatic CNV calling in matched WES data, called "CopyDetective". Unlike other approaches, CNV calling with CopyDetective is a 2-step procedure: first, a quality analysis is performed that determines individual detection thresholds for every sample. Second, the actual CNV calling is performed on the basis of the previously determined thresholds. Our algorithm evaluates the change in variant allele frequency of polymorphisms and reports the fraction of affected cells for every CNV. Analyzing 4 WES data sets (n = 100), we observed superior performance of CopyDetective compared with ExomeCNV, VarScan2, ControlFREEC, ExomeDepth, and CNV-seq.
Individual detection thresholds reveal that not every WES data set is equally apt for CNV calling. Initial quality analyses that determine individual detection thresholds, as realized by CopyDetective, can and should be performed prior to actual variant calling.
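To illustrate how a change in variant allele frequency (VAF) of a polymorphism can be related to the fraction of affected cells, the following is a minimal sketch and not CopyDetective's actual implementation. It assumes a heterozygous germline polymorphism (expected VAF of 0.5 in the matched normal sample), a diploid background, and a CNV that removes or adds exactly one copy of one allele in a fraction of the cells. The function names and the simple one-copy model are assumptions made for this example.

```python
def expected_vaf_shift(cell_fraction: float, kind: str) -> float:
    """Absolute deviation of the tumor VAF from 0.5 for a heterozygous SNP.

    Assumed model (hypothetical, for illustration only):
      deletion:    one allele is lost in `cell_fraction` of the cells,
                   so the average total copy number is 2 - f
      duplication: one allele is gained in `cell_fraction` of the cells,
                   so the average total copy number is 2 + f
    """
    if not 0.0 <= cell_fraction <= 1.0:
        raise ValueError("cell_fraction must lie in [0, 1]")
    f = cell_fraction
    if kind == "deletion":
        # Retained allele: 1 / (2 - f); deviation from 0.5 is f / (2 * (2 - f))
        return f / (2.0 * (2.0 - f))
    if kind == "duplication":
        # Gained allele: (1 + f) / (2 + f); deviation from 0.5 is f / (2 * (2 + f))
        return f / (2.0 * (2.0 + f))
    raise ValueError("kind must be 'deletion' or 'duplication'")


def affected_cell_fraction(vaf_shift: float, kind: str) -> float:
    """Invert the model above: estimate the affected cell fraction from |VAF - 0.5|."""
    d = abs(vaf_shift)
    if kind == "deletion":
        if d > 0.5:
            raise ValueError("a VAF shift above 0.5 is impossible")
        return 4.0 * d / (1.0 + 2.0 * d)
    if kind == "duplication":
        if d > 1.0 / 6.0:
            raise ValueError("a one-copy gain cannot shift the VAF by more than 1/6")
        return 4.0 * d / (1.0 - 2.0 * d)
    raise ValueError("kind must be 'deletion' or 'duplication'")


if __name__ == "__main__":
    # Example: the polymorphism has VAF 0.5 in the normal and 0.6 in the tumor sample.
    shift = 0.6 - 0.5
    print("deletion:", affected_cell_fraction(shift, "deletion"))
    print("duplication:", affected_cell_fraction(shift, "duplication"))
```

Under these assumptions, a VAF shift of 0.1 corresponds to roughly one third of the cells carrying a deletion, or half of the cells carrying a duplication. The same relation suggests why per-sample detection thresholds matter: the smaller the affected cell fraction, the smaller the VAF shift that has to be distinguished from sampling noise at the available coverage.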