Sharp Andrew J, Itsara Andy, Cheng Ze, Alkan Can, Schwartz Stuart, Eichler Evan E
Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
Hum Mol Genet. 2007 Nov 15;16(22):2770-9. doi: 10.1093/hmg/ddm234. Epub 2007 Aug 28.
Copy-number variants (CNVs) occur frequently in the human genome and may be associated with many human phenotypes. If disease association studies of CNVs are to be performed routinely, copy-number status must be genotyped accurately. We systematically assessed the dynamic range of an oligonucleotide microarray platform, and its ability to predict copy-number accurately, in a set of seven patients previously shown to carry between 1 and 6 copies of an approximately 4 Mb region of 15q12.2-q13.1. We identify probe uniqueness, probe length, uniformity of probe melting temperature, overlap with SNPs and common repeats (particularly Alu elements), and guanine homopolymer content as parameters that significantly affect probe performance. Further, we demonstrate the influence of these criteria on array performance by using them to prospectively filter data from a second array design covering an independent genomic region, and we observe significant improvements in data quality. Informed selection of probes with superior performance characteristics allows the prospective design of oligonucleotide arrays with increased sensitivity and specificity compared with current designs. Although our results are based on data from comparative genomic hybridization experiments, we anticipate that they are relevant to the design of improved oligonucleotide arrays for high-throughput copy-number genotyping of complex regions of the human genome.
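To make the probe-selection criteria listed in the abstract concrete, the following is a minimal Python sketch of a prospective probe filter. It is not the authors' pipeline. The `Probe` record, the function names, and every threshold (length range, GC window, maximum guanine run) are hypothetical values chosen only for illustration, and GC fraction is used here as a crude stand-in for melting-temperature uniformity.

```python
import re
from dataclasses import dataclass


@dataclass
class Probe:
    """Hypothetical probe record; all annotations are assumed to be precomputed."""
    name: str
    sequence: str
    genome_hits: int       # number of perfect-match locations in the reference
    overlaps_snp: bool     # probe overlaps a known SNP
    overlaps_repeat: bool  # probe overlaps a common repeat (e.g. an Alu element)


def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases; a crude proxy for melting temperature."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s) if s else 0.0


def longest_g_run(seq: str) -> int:
    """Length of the longest guanine homopolymer in the probe."""
    runs = re.findall(r"G+", seq.upper())
    return max((len(r) for r in runs), default=0)


def passes_filters(
    probe: Probe,
    min_len: int = 50,          # illustrative threshold, not from the paper
    max_len: int = 70,          # illustrative threshold, not from the paper
    gc_window=(0.35, 0.55),     # illustrative window, not from the paper
    max_g_run: int = 3,         # illustrative threshold, not from the paper
) -> bool:
    """Return True if the probe satisfies every illustrative criterion."""
    if probe.genome_hits != 1:                      # probe uniqueness
        return False
    if not (min_len <= len(probe.sequence) <= max_len):  # probe length
        return False
    low, high = gc_window
    if not (low <= gc_fraction(probe.sequence) <= high):  # Tm uniformity proxy
        return False
    if probe.overlaps_snp or probe.overlaps_repeat:  # SNP / repeat overlap
        return False
    if longest_g_run(probe.sequence) > max_g_run:    # guanine homopolymers
        return False
    return True


def filter_probes(probes):
    """Keep only the probes that pass all filters."""
    return [p for p in probes if passes_filters(p)]
```

In practice a nearest-neighbor melting-temperature model would replace the GC-fraction proxy, and the thresholds would be tuned against measured probe response rather than fixed in advance.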