Lechner-Haag Genomics Core, Vaccine Research Institute of San Diego, 10835 Road to the Cure, Suite 150, San Diego, CA 92121, USA.
Nucleic Acids Res. 2010 Jun;38(11):e121. doi: 10.1093/nar/gkq039. Epub 2010 Mar 17.
Most current microarray oligonucleotide probe design strategies are based on probe design factors (PDFs), which include probe hybridization free energy (PHFE), probe minimum folding energy (PMFE), dimer score, hairpin score, homology score and complexity score. The impact of these PDFs on probe performance was evaluated using four sets of microarray comparative genome hybridization (aCGH) data, which covered two array manufacturing methods and the genomes of two species. Since most of the hybridizing DNA is equimolar in CGH data, such data are ideal for testing the general hybridization properties of almost all candidate oligonucleotides. In all our data sets, the PDFs related to probe secondary structure (PMFE, hairpin score and dimer score) are the most significant factors linearly correlated with probe hybridization intensities. PHFE, homology score and complexity score correlate significantly with probe specificity, but in a non-linear fashion. We developed a new PDF, pseudo probe binding energy (PPBE), by iteratively fitting dinucleotide positional weights and dinucleotide stacking energies until the average residual sum of squares for the model was minimized. PPBE correlated better with probe sensitivity and showed better specificity than all other PDFs, although training data are required to construct a PPBE model before new oligonucleotide probes can be designed. The physical properties measured by PPBE are as yet unknown but include a platform-dependent component. A practical way to use these PDFs for probe design is to set cutoff thresholds that filter out poor-quality probes. Programs and correlation parameters from this study are freely available to facilitate the design of DNA microarray oligonucleotide probes.
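The iterative fit described for PPBE can be illustrated with a minimal sketch. The abstract does not give the exact functional form, so the code below assumes a simple bilinear model in which the predicted intensity of a probe is the sum, over positions, of a positional weight times the energy of the dinucleotide at that position. It uses alternating least squares as a stand-in for the authors' fitting procedure, and the names `encode`, `predict` and `fit_ppbe_like` are hypothetical, not taken from the published programs. All probes are assumed to have the same length.

```python
import itertools
import numpy as np

# The 16 dinucleotides, mapped to column indices 0..15.
DINUCS = ["".join(p) for p in itertools.product("ACGT", repeat=2)]
DINUC_INDEX = {d: k for k, d in enumerate(DINUCS)}


def encode(probes):
    """Return an (n_probes, L-1) integer array of dinucleotide indices."""
    return np.array(
        [[DINUC_INDEX[s[i:i + 2]] for i in range(len(s) - 1)] for s in probes]
    )


def predict(codes, weights, energies):
    """Assumed model: sum over positions of weight[i] * energy[dinucleotide at i]."""
    return (weights * energies[codes]).sum(axis=1)


def fit_ppbe_like(probes, intensities, n_iter=50, tol=1e-9):
    """Alternate between fitting energies and positional weights (illustrative only)."""
    codes = encode(probes)
    y = np.asarray(intensities, dtype=float)
    n, m = codes.shape
    rows = np.arange(n)
    weights = np.ones(m)      # start from uniform positional weights
    energies = np.zeros(16)
    prev = np.inf
    mean_rss = np.inf
    for _ in range(n_iter):
        # Step 1: hold the weights fixed and solve for the 16 energies.
        A = np.zeros((n, 16))
        for i in range(m):
            A[rows, codes[:, i]] += weights[i]
        energies, *_ = np.linalg.lstsq(A, y, rcond=None)
        # Step 2: hold the energies fixed and solve for the positional weights.
        B = energies[codes]
        weights, *_ = np.linalg.lstsq(B, y, rcond=None)
        # Average squared residual of the current model.
        mean_rss = float(np.mean((y - predict(codes, weights, energies)) ** 2))
        if prev - mean_rss < tol:   # stop once the fit no longer improves
            break
        prev = mean_rss
    return weights, energies, mean_rss
```

Because each step solves a linear least-squares problem with the other parameter set held fixed, the average squared residual cannot increase from one iteration to the next, although the procedure may settle in a local minimum. A model fitted this way on training intensities could then supply a score for thresholding candidate probes, in the same way as the other PDFs.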