The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada.
Nat Biotechnol. 2011 May 8;29(6):512-20. doi: 10.1038/nbt.1852.
We have systematically compared copy number variant (CNV) detection on eleven microarrays to evaluate data quality and CNV calling, reproducibility, concordance across array platforms and laboratory sites, breakpoint accuracy and analysis tool variability. Different analytic tools applied to the same raw data typically yield CNV calls with <50% concordance. Moreover, reproducibility in replicate experiments is <70% for most platforms. Nevertheless, these findings should not preclude detection of large CNVs for clinical diagnostic purposes because large CNVs with poor reproducibility are found primarily in complex genomic regions and would typically be removed by standard clinical data curation. The striking differences between CNV calls from different platforms and analytic tools highlight the importance of careful assessment of experimental design in discovery and association studies and of strict data curation and filtering in diagnostics. The CNV resource presented here allows independent data evaluation and provides a means to benchmark new algorithms.
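The concordance figures quoted above depend on how two CNV call sets are matched, which the abstract does not define. The following is a minimal sketch of one common convention, assuming that two calls match when they lie on the same chromosome, have the same type, and overlap reciprocally by at least 50% of each call's length. The threshold, the `Call` tuple layout, the function names, and the example coordinates are all illustrative assumptions, not the study's method or data.

```python
from typing import List, Tuple

# Hypothetical call representation: (chromosome, start, end, type),
# with start < end. Not the data format used in the study.
Call = Tuple[str, int, int, str]


def reciprocal_overlap(a: Call, b: Call, min_frac: float = 0.5) -> bool:
    """Return True if a and b share chromosome and type and their overlap
    covers at least min_frac of the length of each call."""
    if a[0] != b[0] or a[3] != b[3]:
        return False
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return False
    return (overlap >= min_frac * (a[2] - a[1])
            and overlap >= min_frac * (b[2] - b[1]))


def concordance(calls_a: List[Call], calls_b: List[Call],
                min_frac: float = 0.5) -> float:
    """Fraction of calls in calls_a matched by at least one call in calls_b.
    The measure is asymmetric, so report it in both directions."""
    if not calls_a:
        return 0.0
    matched = sum(
        1 for a in calls_a
        if any(reciprocal_overlap(a, b, min_frac) for b in calls_b)
    )
    return matched / len(calls_a)


# Made-up calls from two hypothetical tools run on the same sample.
tool_a = [("chr1", 1000, 5000, "loss"),
          ("chr1", 20000, 30000, "gain"),
          ("chr2", 500, 900, "loss")]
tool_b = [("chr1", 1500, 5200, "loss"),
          ("chr1", 28000, 40000, "gain")]

print(concordance(tool_a, tool_b))  # one of three calls in tool_a is matched
print(concordance(tool_b, tool_a))  # one of two calls in tool_b is matched
```

Because the result depends on which call set is the reference and on the overlap threshold, any concordance percentage is only meaningful alongside the matching rule that produced it. The same function could be applied to replicate experiments to estimate reproducibility.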