Zheng Jie, Svensson Jan T, Madishetty Kavitha, Close Timothy J, Jiang Tao, Lonardi Stefano
Department of Computer Science & Engineering, University of California, Riverside, CA 92521, USA.
BMC Bioinformatics. 2006 Jan 9;7:7. doi: 10.1186/1471-2105-7-7.
Expressed sequence tag (EST) datasets represent perhaps the largest collection of genetic information. ESTs can be exploited in a variety of biological experiments and analyses. Here we are interested in the design of overlapping oligonucleotide (overgo) probes from large unigene (EST-contig) datasets.
OLIGOSPAWN is a suite of software tools that offers two complementary services: (1) the selection of "unique" oligos, each of which appears in one unigene but does not occur (exactly or approximately) in any other, and (2) the selection of "popular" oligos, each of which occurs (exactly or approximately) in as many unigenes as possible. In this paper, we describe the functionalities of OLIGOSPAWN and the computational methods it employs, and we report experimental results for the overgo probes designed with it.
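The distinction between "unique" and "popular" oligos can be illustrated with a small sketch. This is not OLIGOSPAWN's algorithm: it is a naive illustration that assumes exact matches only (no approximate occurrences), ignores reverse complements, and uses hypothetical function names and toy sequences chosen for this example.

```python
from collections import defaultdict


def kmer_occurrences(unigenes, k):
    """Map each k-mer to the set of unigene IDs containing it (exact matches only)."""
    occ = defaultdict(set)
    for uid, seq in unigenes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            occ[seq[i:i + k]].add(uid)
    return occ


def unique_oligos(occ):
    """Return k-mers found in exactly one unigene, mapped to that unigene's ID."""
    return {kmer: next(iter(ids)) for kmer, ids in occ.items() if len(ids) == 1}


def popular_oligos(occ, top=5):
    """Return the `top` k-mers ranked by the number of unigenes they occur in."""
    ranked = sorted(occ.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(kmer, len(ids)) for kmer, ids in ranked[:top]]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_unigenes = {"u1": "ACGTACGT", "u2": "CGTACGAA", "u3": "TTTTACGT"}
    occ = kmer_occurrences(toy_unigenes, k=4)
    print(unique_oligos(occ))
    print(popular_oligos(occ, top=3))
```

A hash-based index like this only scales to small inputs and exact matching; handling approximate occurrences over tens of Mb of sequence, as the abstract describes, requires the more efficient methods presented in the paper.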
The algorithms we designed are highly efficient and can process unigene datasets on the order of several tens of Mb in a few hours on a regular PC. The software has been used to design overgo probes employed to screen a barley (Hordeum vulgare) BAC library. OLIGOSPAWN is freely available at http://oligospawn.ucr.edu/.