IEEE/ACM Trans Comput Biol Bioinform. 2017 Nov-Dec;14(6):1237-1250. doi: 10.1109/TCBB.2016.2576441. Epub 2016 Jun 7.
Among the important genomic variants associated with susceptibility and resistance to complex diseases, Copy Number Variation (CNV) has emerged as a prevalent class of structural variation. Following the flood of next-generation sequencing data, numerous publicly available tools have been developed to provide computational strategies for identifying CNV with improved accuracy. This review goes beyond scrutinizing the main approaches widely used for structural variant detection in general, including Split-Read, Paired-End Mapping, Read-Depth, and Assembly-based methods. In this paper, (1) we characterize the relevant technical details of CNV detection that can affect the estimation of breakpoints and copy number, (2) we pinpoint the most important insights related to GC-content and mappability biases, and (3) we discuss the main caveats in the tool evaluation process. The points raised in this study highlight common assumptions, a variety of possible limitations, valuable insights, and directions for desirable contributions to the state of the art in CNV detection tools.