Li Xingyuan, He Zhili, Zhou Jizhong
PerkinElmer Life and Analytical Sciences, 549 Albany Street, Boston, MA 02118, USA.
Nucleic Acids Res. 2005 Oct 24;33(19):6114-23. doi: 10.1093/nar/gki914. Print 2005.
The specificity of an oligonucleotide for microarray hybridization can be predicted from its sequence identity to non-targets, the length of its continuous stretches matching non-targets, and/or its binding free energy to non-targets. Most currently available programs use only one or two of these criteria, and so may select falsely 'specific' oligonucleotides or miss truly optimal probes in a considerable proportion of cases. We have developed a software tool, CommOligo, that uses new algorithms and all three criteria to select optimal oligonucleotide probes. A series of filters (sequence identity, free energy, continuous stretch, GC content, self-annealing, distance to the 3'-untranslated region (3'-UTR) and melting temperature (Tm)) is used to check each possible oligonucleotide. Sequence identity is calculated from gapped global alignments. A traversal algorithm generates the alignments used for free energy calculation. The optimal Tm interval is determined from the probe candidates that have passed all other filters. Final probes are picked using a combination of user-configurable piece-wise linear functions and an iterative process. The thresholds for the identity, stretch and free energy filters are determined automatically from experimental data by an accessory software tool, CommOligo_PE (CommOligo Parameter Estimator). The program was used to design probes for both whole-genome and highly homologous sequence data. CommOligo and CommOligo_PE are freely available to academic users upon request.
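To make the filtering idea concrete, the following is a minimal Python sketch of three of the filters named in the abstract: GC content, continuous stretch, and sequence identity from a gapped global alignment. It is not CommOligo's implementation. The function names, the scoring scheme (match +1, mismatch -1, gap -1), the assumption that each probe is compared against a non-target window of similar length, and all threshold values are hypothetical placeholders. In the tool itself the thresholds are estimated from experimental data by CommOligo_PE. The free energy, self-annealing, 3'-UTR distance and Tm filters are omitted here.

```python
from typing import Iterable, Tuple


def gc_content(oligo: str) -> float:
    """Fraction of G and C bases in the oligo."""
    oligo = oligo.upper()
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def longest_stretch(probe: str, non_target: str) -> int:
    """Length of the longest contiguous run of bases shared by both sequences."""
    probe, non_target = probe.upper(), non_target.upper()
    best = 0
    prev = [0] * (len(non_target) + 1)
    for i in range(1, len(probe) + 1):
        cur = [0] * (len(non_target) + 1)
        for j in range(1, len(non_target) + 1):
            if probe[i - 1] == non_target[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Identity (matches / alignment length) of a gapped global alignment.

    Needleman-Wunsch style dynamic programming; each cell stores
    (score, matches, alignment_length) so no traceback is needed.
    The scoring values are illustrative assumptions.
    """
    a, b = a.upper(), b.upper()
    prev = [(j * gap, 0, j) for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [(i * gap, 0, i)]
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            diag, up, left = prev[j - 1], prev[j], cur[j - 1]
            candidates = [
                (diag[0] + (match if same else mismatch),
                 diag[1] + int(same), diag[2] + 1),
                (up[0] + gap, up[1], up[2] + 1),
                (left[0] + gap, left[1], left[2] + 1),
            ]
            # Prefer the higher score; break ties by more matches.
            cur.append(max(candidates, key=lambda c: (c[0], c[1])))
        prev = cur
    _, matches, length = prev[-1]
    return matches / length if length else 0.0


def passes_filters(probe: str, non_targets: Iterable[str],
                   max_identity: float = 0.85, max_stretch: int = 15,
                   gc_range: Tuple[float, float] = (0.4, 0.6)) -> bool:
    """Apply the three filters; default thresholds are placeholders only."""
    if not gc_range[0] <= gc_content(probe) <= gc_range[1]:
        return False
    for seq in non_targets:
        if global_identity(probe, seq) > max_identity:
            return False
        if longest_stretch(probe, seq) > max_stretch:
            return False
    return True
```

A candidate such as `"ACGTACGTAC"` passes against the non-target `"TTTTTTTTTT"`, but is rejected when the non-target list contains the probe sequence itself, because the identity then exceeds the placeholder cutoff. Running the cheap GC check before the quadratic alignment steps is a design choice of this sketch. The abstract does not state the order in which CommOligo applies its filters.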