Seringhaus Michael, Rozowsky Joel, Royce Thomas, Nagalakshmi Ugrappa, Jee Justin, Snyder Michael, Gerstein Mark
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
BMC Genomics. 2008 Dec 31;9:635. doi: 10.1186/1471-2164-9-635.
Mismatched oligonucleotides are widely used on microarrays to differentiate specific from nonspecific hybridization. While many experiments rely on such oligos, the hybridization behavior of various degrees of mismatch (MM) structure has not been extensively studied. Here, we present the results of two large-scale microarray experiments on S. cerevisiae and H. sapiens genomic DNA, to explore MM oligonucleotide behavior with real sample mixtures under tiling-array conditions.
We examined all possible nucleotide substitutions at the central position of 36-nucleotide probes, and found that nonspecific binding by MM oligos depends upon the individual nucleotide substitutions they incorporate: C→A, C→G and T→A (yielding purine-purine mispairs) are the most disruptive, whereas A→X substitutions are the least disruptive. We also quantify a marked GC skew effect: substitutions that raise probe GC content yield higher intensity, and those that lower it yield lower intensity. This skew is small in highly expressed regions (±0.5% of total intensity range) and large (±2% or more) elsewhere. Multiple mismatches per oligo are largely additive in effect: each MM added in a distributed fashion causes an additional 21% intensity drop relative to the perfect match (PM), three-fold more disruptive than adding adjacent mispairs (7% drop per MM).
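The additive behavior described above can be illustrated with a small sketch. It assumes a strictly linear model in which each mismatch subtracts a fixed fraction of PM intensity (21% when distributed, 7% when adjacent), floored at zero. The function name `relative_intensity` and the layout labels are hypothetical; the abstract reports only the per-mismatch figures, not a fitted formula.

```python
# Per-mismatch intensity drops reported in the abstract, as fractions of PM intensity.
DROP_PER_MM = {"distributed": 0.21, "adjacent": 0.07}


def relative_intensity(n_mismatches: int, layout: str) -> float:
    """Estimate MM probe intensity as a fraction of PM intensity.

    Assumption: drops are strictly additive (linear in the number of
    mismatches) and the result is floored at zero. This is an
    illustration, not the authors' fitted model.
    """
    if n_mismatches < 0:
        raise ValueError("n_mismatches must be non-negative")
    if layout not in DROP_PER_MM:
        raise ValueError(f"unknown layout: {layout!r}")
    return max(0.0, 1.0 - DROP_PER_MM[layout] * n_mismatches)


if __name__ == "__main__":
    for n in range(5):
        print(n,
              round(relative_intensity(n, "distributed"), 2),
              round(relative_intensity(n, "adjacent"), 2))
```

Under this assumed model, three adjacent mismatches cost about as much intensity as one distributed mismatch, which mirrors the three-fold difference stated in the text.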
We investigate several parameters for oligonucleotide design, including the effects of each central nucleotide substitution on array signal intensity and of multiple MM per oligo. To avoid GC skew, individual substitutions should not alter probe GC content. RNA sample mixture complexity may increase the amount of nonspecific hybridization, magnify GC skew and boost the intensity of MM oligos at all levels.
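The recommendation that substitutions should not alter probe GC content can be sketched as follows. The only substitutions that keep GC content unchanged are swaps within the A/T pair or within the G/C pair. The helper names (`gc_content`, `central_mismatch`) are hypothetical, and the choice of 0-based index `len(probe) // 2` as the "central" position of an even-length probe is an assumption made for illustration.

```python
# GC-neutral substitutions: swap within the A/T pair or within the G/C pair.
GC_NEUTRAL = {"A": "T", "T": "A", "C": "G", "G": "C"}


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def central_mismatch(probe: str) -> str:
    """Return a copy of probe with a GC-neutral substitution at its center.

    Assumption: the center of an even-length probe is taken to be
    0-based index len(probe) // 2.
    """
    probe = probe.upper()
    mid = len(probe) // 2
    base = probe[mid]
    if base not in GC_NEUTRAL:
        raise ValueError(f"unexpected base at center: {base!r}")
    return probe[:mid] + GC_NEUTRAL[base] + probe[mid + 1:]


if __name__ == "__main__":
    pm = "ACGT" * 9  # made-up 36-nt perfect-match probe
    mm = central_mismatch(pm)
    print(pm)
    print(mm)
    print(gc_content(pm) == gc_content(mm))
```

GC-neutral substitutions are not equally disruptive: the abstract lists C→G and T→A among the most disruptive substitutions and A→X among the least, so a design may still need to account for which base sits at the center.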