Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.
Bioinformatics. 2011 May 1;27(9):1195-200. doi: 10.1093/bioinformatics/btr107. Epub 2011 Feb 25.
Careful normalization of array-based comparative genomic hybridization (aCGH) data is critical for the accurate detection of copy number changes. The difference in labelling affinity between the two fluorophores used in aCGH (usually Cy5 and Cy3) can be observed as a bias within the intensity distributions. If left uncorrected, this bias is likely to skew data interpretation during downstream analysis and lead to an increased number of false discoveries.
In this study, we have developed aCGH.Spline, a natural cubic spline interpolation method followed by linear interpolation of outlier values, which removes a large portion of the dye bias from large aCGH datasets quickly and efficiently.
We have shown that removing this bias and reducing the experimental noise has a strong positive impact on the ability to accurately detect both copy number variation (CNV) and copy number alterations (CNA).
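The abstract names the two ingredients of the method (a natural cubic spline, then linear handling of outlier values) but not its details, so the following is only a minimal sketch of that general idea and not the aCGH.Spline implementation. Everything specific here is an assumption made for illustration: the function names `natural_cubic_spline` and `correct_dye_bias`, the choice to place knots at matched quantiles of the two channels, the 1% and 99% cut-offs that define "outliers", the secant-line treatment of those outliers, and the decision to map Cy5 onto the Cy3 scale.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return f(x): the natural cubic spline through knots with strictly increasing xs."""
    n = len(xs) - 1  # number of intervals (needs at least two knots)
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives; natural boundary keeps m[0] = m[n] = 0
    if n >= 2:
        # Tridiagonal system for the interior second derivatives m[1..n-1].
        diag = [2.0 * (h[j] + h[j + 1]) for j in range(n - 1)]
        rhs = [6.0 * ((ys[j + 2] - ys[j + 1]) / h[j + 1] - (ys[j + 1] - ys[j]) / h[j])
               for j in range(n - 1)]
        for j in range(1, n - 1):  # forward elimination (Thomas algorithm)
            w = h[j] / diag[j - 1]
            diag[j] -= w * h[j]
            rhs[j] -= w * rhs[j - 1]
        m[n - 1] = rhs[n - 2] / diag[n - 2]
        for j in range(n - 3, -1, -1):  # back substitution
            m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]

    def f(x):
        # Locate the interval containing x, clamped to the first/last interval.
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return f


def correct_dye_bias(cy5, cy3, n_knots=11, lo=0.01, hi=0.99):
    """Map Cy5 log-intensities onto the Cy3 scale (hypothetical scheme, see lead-in).

    Knots pair the q-th quantile of Cy5 with the q-th quantile of Cy3 for q in
    [lo, hi]. Values outside the knot range are treated as outliers and mapped
    along the straight line through the two outermost knots on that side.
    """
    s5, s3 = sorted(cy5), sorted(cy3)
    kx, ky = [], []
    for k in range(n_knots):
        q = lo + (hi - lo) * k / (n_knots - 1)
        x = s5[round(q * (len(s5) - 1))]
        y = s3[round(q * (len(s3) - 1))]
        if not kx or x > kx[-1]:  # keep knot x-values strictly increasing
            kx.append(x)
            ky.append(y)
    spline = natural_cubic_spline(kx, ky)  # assumes at least two distinct knots
    left = (ky[1] - ky[0]) / (kx[1] - kx[0])
    right = (ky[-1] - ky[-2]) / (kx[-1] - kx[-2])

    def mapped(x):
        if x < kx[0]:  # low outlier: linear
            return ky[0] + left * (x - kx[0])
        if x > kx[-1]:  # high outlier: linear
            return ky[-1] + right * (x - kx[-1])
        return spline(x)

    return [mapped(x) for x in cy5]
```

With two equal-length lists of log2 intensities, `correct_dye_bias(cy5, cy3)` returns the adjusted Cy5 values, from which log ratios against `cy3` can then be recomputed. Restricting the spline to the central quantiles keeps a handful of extreme probes from bending the fitted curve, which is one plausible reason for treating outliers separately.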